SINGLE-CHAIN RECOMBINANT COMPLEXES OF HEPATITIS C
VIRUS NS3 PROTEASE AND NS4A COFACTOR PEPTIDE 


This filing claims priority from Provisional U.S. Patent 
Applications USSN 60/067,315, filed November 28, 1997 and USSN 
60/094,331, filed July 28, 1998, each of which is incorporated herein by 
reference. 


BACKGROUND OF THE INVENTION 

Hepatitis C virus (HCV) is considered to be the major etiological
agent of non-A non-B (NANB) hepatitis, chronic liver disease, and
hepatocellular carcinoma (HCC) around the world, with an estimated
human seroprevalence of 1% globally [Alter et al., 1994, Gastroenterol.
Clin. North Am. 23:437-455; Behrens et al., 1996, EMBO J. 15:12-22]. Four
million individuals may be infected in the United States. The viral
infection accounts for greater than 90% of transfusion-associated
hepatitis in the U.S. and it is the predominant form of hepatitis in adults
over 40 years of age. Almost all of the infections result in chronic
hepatitis and nearly 20% of those infected develop liver cirrhosis.

The virus particle has not been identified due to the lack of an
efficient ex vivo replication system and the extremely low amount of
HCV particles in infected liver tissues or blood. However, molecular
cloning of the viral genome has been accomplished by isolating the
messenger RNA (mRNA) from the serum of infected chimpanzees and
preparing cDNA using recombinant methodologies [Grakoui A. et al.,
1993, J. Virol. 67:1385-1395]. It is now known that HCV contains a
positive strand RNA genome comprising approximately 9400
nucleotides, the organization of which is similar to that of
flaviviruses and pestiviruses. The genome of HCV, a (+)-stranded RNA
molecule of ~9.4 kb, encodes a single large polyprotein of about 3000
amino acids which undergoes proteolysis to form mature viral proteins
in infected cells.

Cell-free translation of the viral polyprotein and cell culture
expression studies have established that the HCV polyprotein is
processed by cellular and viral proteases to produce the putative
structural and nonstructural (NS) proteins. At least ten mature viral
proteins are produced from the polyprotein by specific proteolysis. The
order and nomenclature of the cleavage products are as follows:
NH2-C-E1-E2-p7-NS2-NS3-NS4A-NS4B-NS5A-NS5B-COOH (Fig. 1) [Grakoui et
al., 1993, J. Virol. 67:1385-95; Hijikata et al., 1991, PNAS 88:5547-51;
Lin et al., 1994, J. Virol. 68:5063-73]. The three amino-terminal putative
structural proteins, C (capsid), E1, and E2 (two envelope glycoproteins),
are believed to be cleaved by a host signal peptidase of the endoplasmic
reticulum (ER). The host enzyme is also responsible for generating the
amino terminus of NS2. The proteolytic processing of the nonstructural
proteins is carried out by the viral proteases NS2-3 and NS3, contained
within the viral polyprotein. The NS2-3 protease catalyzes the cleavage
between NS2 and NS3. It is a metalloprotease and requires both NS2 and
the protease domain of NS3.

The NS3 protease catalyzes the rest of the cleavages in the
nonstructural part of the polyprotein. The NS3 protein contains 631
amino acid residues and comprises two enzymatic activities: the
protease domain contained within amino acid residues 1-181 and a
helicase/ATPase domain contained within the rest of the protein [Kim et
al., 1995, Biochem. Biophys. Res. Comm. 215:160-166]. It is not known if
the 70 kD NS3 protein is cleaved further in infected cells to separate the
protease domain from the helicase domain, although no cleavage has
been observed in cell culture expression studies.

The NS3 protease is a member of the serine class of enzymes. It
uses a His, Asp, Ser catalytic triad. Mutation of the Ser residue abolishes
cleavage of NS3/4A, NS4A/4B, NS4B/5A, and NS5A/5B substrates. The
cleavage between NS3 and NS4A is intramolecular, whereas the
cleavages at the NS4A/4B, NS4B/5A, and NS5A/5B sites occur in trans.
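
The processing scheme described above can be summarized as a small lookup. The following is only an illustrative sketch: the names `PRODUCTS`, `classify` and `cleavage_map` are hypothetical, the assignment of the E2/p7 junction to the host signal peptidase is a reading of the text rather than an explicit statement in it, and "not stated" marks junctions for which the text gives no cis/trans information.

```python
# Order of mature products along the polyprotein, amino to carboxyl terminus.
PRODUCTS = ["C", "E1", "E2", "p7", "NS2", "NS3", "NS4A", "NS4B", "NS5A", "NS5B"]

# Junctions described in the text as being cleaved by NS3 in trans.
NS3_TRANS = {("NS4A", "NS4B"), ("NS4B", "NS5A"), ("NS5A", "NS5B")}


def classify(upstream, downstream):
    """Return (enzyme, mode) for one junction, as read from the text."""
    if (upstream, downstream) == ("NS2", "NS3"):
        return ("NS2-3 protease", "not stated")
    if (upstream, downstream) == ("NS3", "NS4A"):
        return ("NS3 protease", "cis")  # intramolecular cleavage
    if (upstream, downstream) in NS3_TRANS:
        return ("NS3 protease", "trans")
    # Remaining junctions lie in the structural region or at the NS2 amino terminus.
    return ("host signal peptidase", "not stated")


def cleavage_map(products=PRODUCTS):
    """Map each junction between adjacent products to its enzyme and mode."""
    return {f"{u}/{d}": classify(u, d) for u, d in zip(products, products[1:])}


if __name__ == "__main__":
    for junction, (enzyme, mode) in cleavage_map().items():
        print(f"{junction:10s} {enzyme:22s} {mode}")
```

Ten products give nine junctions, four of which are assigned to the NS3 protease.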

Experiments using transient expression of various forms of HCV
NS polyproteins in mammalian cells have established that the NS3
serine protease is necessary but not sufficient for efficient processing of
all of these cleavages. Like the flaviviruses, the HCV NS3 protease also
requires a cofactor to catalyze some of these cleavage reactions. Efficient
proteolytic processing at NS3/4A, NS4A/4B, NS4B/5A, and NS5A/5B
sites within the non-structural domain of hepatitis C virus requires a
heterodimeric complex of the NS3 serine protease and the NS4A
protein [Bartenschlager et al., 1995, J. Virol. 67:3835-3844; Failla et al.,
1994, J. Virol. 68:3753-3760]. A 13-amino acid synthetic NS4A peptide,
corresponding to the central hydrophobic domain of NS4A protein and
spanning residues 21-33, has been shown to be sufficient for activation of
NS3 protease [Butkiewicz et al., 1996, Virology 225:328-338]. A smaller
domain (amino acid residues 22-30) of NS4A has been shown to be
sufficient for activation of the protease [Lin et al., 1995, J. Virol.
69:4377-80].

The recently published three dimensional structure of the NS3
protease [Kim et al., 1996, Cell 87:343-355; Love et al., 1996, Cell 87:331-342]
revealed that the N-terminal 37 residues of NS3 adopt a β (residues 6-9)-
α (residues 14-22)-β (residues 33-37) structure upon binding of a synthetic
peptide corresponding to the central hydrophobic domain spanning
residues 21-32 of NS4A protein.

Production of an active NS3₁₋₁₈₁-NS4A peptide complex at
present involves two steps. First, the NS3 catalytic domain (amino acid
residues 1-181) is produced as a recombinant protein in E. coli. Next, a
13-19 residue NS4A peptide spanning the central hydrophobic domain of
the full-length NS4A protein is added to form a non-covalent complex
[Kim et al., 1996, Cell 87:343-355]. This complex, although more active
than the protease alone, is approximately 8-10 fold less active than the
full-length NS3₁₋₆₃₁-NS4A₁₋₅₄ form of the protease as judged by its
proteolytic activity toward a synthetic substrate based on the native
NS5A-NS5B amino acid sequence [Urbani et al., 1997, J. Biol. Chem.
272(14):9204-09; Steinkuhler et al., 1996, J. Virol. 70(10):6694-6700].
Moreover, NS4A peptide has been shown to have a very low affinity (10
µM) for NS3 in solution [Bianchi et al., 1997, Biochemistry 36:7890-7897],
requiring addition of NS4A peptide in the high micromolar range to
ensure a 1:1 stoichiometric complex with NS3 protease. The limited
solubility of this peptide in aqueous buffer due to its hydrophobic nature
makes working with this peptide at these concentrations difficult.
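
The need for high micromolar peptide concentrations can be illustrated with a simple equilibrium calculation. The sketch below assumes that the quoted 10 µM figure is a dissociation constant and that binding follows a simple 1:1 model with no aggregation or solubility limit. The function name `fraction_bound` and the example concentrations are hypothetical and are not taken from the cited work.

```python
import math


def fraction_bound(protein_uM, peptide_uM, kd_uM=10.0):
    """Fraction of protein in complex for 1:1 binding (exact quadratic solution).

    All concentrations are total concentrations in micromolar.
    kd_uM defaults to 10, an assumed reading of the affinity quoted in the text.
    """
    total = protein_uM + peptide_uM + kd_uM
    complex_uM = (total - math.sqrt(total ** 2 - 4.0 * protein_uM * peptide_uM)) / 2.0
    return complex_uM / protein_uM


if __name__ == "__main__":
    # Hypothetical case: 1 µM protease titrated with increasing peptide.
    for peptide in (1.0, 10.0, 100.0, 1000.0):
        print(f"{peptide:7.1f} µM peptide -> {fraction_bound(1.0, peptide):.3f} bound")
```

Under these assumptions, an equimolar mixture at 1 µM leaves most of the protease free (roughly 8% bound), and about a hundredfold excess of peptide is needed to exceed 90% occupancy.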

Because the HCV NS3 protease cleaves the non-structural HCV
proteins necessary for HCV replication, the NS3 protease can be a target
for the development of therapeutic agents against HCV. The
gene encoding the HCV NS3 protein has been cloned as disclosed in U.S.
Patent No. 5,371,017. To date, however, the protease has not been
produced in a covalent complex with the NS4A cofactor in a soluble,
active and stable form. Such a complex would be useful as a target in a
high throughput screen to discover therapeutic agents. A stable, active
HCV protease is also required for determination of modes of binding of
inhibitors by NMR, for structural determination by NMR spectroscopy,
for crystallography, and for virtually all biophysical and biochemical
studies of the activated form of the enzyme.


SUMMARY OF THE INVENTION

The present invention provides NS4A tethered forms of the HCV
NS3 protease comprising single-chain recombinant covalent complexes
of Hepatitis C virus NS3 protease and an NS4A cofactor peptide which
require no subsequent addition of NS4A peptide for activation and
which are as active as the full-length NS3₁₋₆₃₁-NS4A₁₋₅₄. The covalent
NS4A-NS3 complexes of the invention are more soluble, stable and
active than the non-covalent protease-peptide complexes previously
available.

The NS4A tethered forms of the HCV NS3 protease of the
invention consist of covalent NS4A-NS3 complexes comprising a
central hydrophobic domain of the NS4A peptide tethered by a linker of at
least about 4 amino acid residues to the amino terminus of the serine
protease domain of NS3. The amino acid sequences of 20 such
embodiments are defined in the Sequence Listing by SEQ ID NOs: 1-20.
Corresponding nucleotide sequences are provided in SEQ ID NOs: 91-111.

Preferred embodiments of the invention also provide NS4A 
tethered forms of the full length NS3 protease. The amino acid 
sequences of 8 such embodiments are defined in SEQ ID NOs: 11-18. 

Other preferred embodiments of the invention further provide
mutant forms of the covalent NS4A-NS3 complexes in which point
mutations introduced at positions 17 and/or 18 of the NS3 domain
change a hydrophobic amino acid residue to a hydrophilic residue. This
further improves the solubility of the complexes and provides the
protein in a monodispersed form. The amino acid sequences of 13 such
embodiments are defined in the Sequence Listing by SEQ ID NOs: 2-4,
6-8, 10, 12-14, and 16-18.

The invention still further provides mutant forms of the covalent
NS4A-NS3 complexes in which a mutation introduced at position 139 of
the NS3 domain changes a serine residue to an alanine residue. The
amino acid sequences of 9 such embodiments are defined in SEQ ID
NOs: 5-8, 15-18 and 20.

The invention further provides covalent HCV NS4A-NS3
complexes having an easily removable histidine tag comprising three or
more histidine residues fused to the complex. This enables rapid
purification of the protease with easy removal of the tag following
purification.

The present invention further provides for isolated nucleic acids
and vectors which encode the covalent NS4A-NS3 complexes of the
present invention, and host cells transformed or transfected by said
nucleic acids or vectors.

The invention still further provides methods for making the
covalent NS4A-NS3 complexes comprising culturing the transformed or
transfected host cell under conditions in which the nucleic acid or vector
is expressed.

The invention also provides methods for identifying inhibitors of
HCV NS3. Methods are provided for detecting inhibitors of the protease
activity, the helicase activity and the ATPase activity of NS3 using the
disclosed covalent complexes.


BRIEF DESCRIPTION OF THE FIGURES
Figure 1 schematically depicts the HCV genome.

Figure 2 depicts the recombinant synthesis of plasmid
pHIS-NS4A₂₁₋₃₂-GSGS-NS3₃₋₁₈₁.

Figure 3 depicts the recombinant synthesis of plasmid pHIS-NS3₁₋₆₃₁.

Figure 4 depicts the recombinant synthesis of plasmid
pHIS-NS4A₂₁₋₃₂-GSGS-NS3₃₋₆₃₁.

Figures 5A and 5B schematically depict a high throughput assay for
discovering HCV protease inhibitors using surface plasmon resonance
technology. Figure 5A illustrates the outcome expected in the absence of
an uninhibited HCV protease, while 5B illustrates the outcome expected
in the presence of an active, uninhibited HCV protease.

Figure 6 shows the nucleic acid unwinding activity of the covalent
His-NS4A₂₁₋₃₂-GSGS-NS3₃₋₆₃₁ as compared to that of the
His-NS3₁₋₆₃₁/NS4A₁₋₅₄.

Figure 7 shows the ATPase activity of the covalent
His-NS4A₂₁₋₃₂-GSGS-NS3₃₋₆₃₁ complex as monitored by thin layer
chromatography.

DETAILED DESCRIPTION OF THE INVENTION

The teachings of all references cited are incorporated herein in
their entirety by reference.

The covalent NS4A-NS3 complexes of the present invention are
useful for structural determination and determination of mode of
binding of HCV inhibitors by NMR spectroscopy. Moreover, they
provide a more soluble and stable form of HCV NS3 protease than the
presently available non-covalent NS3₁₋₁₈₁-NS4A peptide complexes for
crystallography studies, high throughput screening assays and other
conventional biophysical and biochemical investigations.

Several representative embodiments of the covalent NS4A-NS3
complexes of the invention are disclosed in the examples below. In one
such embodiment, NS4A residues 21-32 were tethered to the amino
terminus of residues 3-181 of mature NS3 protease by a 4-residue linker,
GSGS (SEQ ID NO: 21). The complex was overexpressed as a soluble
protein in E. coli and purified to homogeneity by a combination of metal
chelate and size-exclusion chromatography. The tethered complex,
HIS-NS4A₂₁₋₃₂-GSGS-NS3₃₋₁₈₁ (SEQ ID NO: 1), cleaved an NS5A/5B synthetic
substrate with a catalytic efficiency identical to that of the non-covalent
full-length protease, NS3₁₋₆₃₁-NS4A₁₋₅₄.

In other embodiments of the invention, the NS4A hydrophobic
domain and the NS3 serine protease domain are covalently tethered
using different amino acid linkers. The preferred amino acid linkers of
the invention comprise at least about four amino acid residues. More
preferably, the linkers consist of from four to six amino acid residues.
More preferably, four-residue linkers are used. Most preferably, amino
acid linkers having the sequence defined by SEQ ID NO: 21 or 22 are used
to tether the NS4A hydrophobic domain and the NS3 serine protease
domain.
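
The domain arrangement of such a tethered construct can be sketched as string assembly with 1-based, inclusive residue ranges. This is an illustration only: the helper names `residues` and `tethered_construct` and the `tag` parameter are hypothetical, and the input sequences used in the demonstration are placeholders, not the sequences of the Sequence Listing.

```python
def residues(sequence, first, last):
    """Return residues first..last of a sequence (1-based, inclusive)."""
    if not (1 <= first <= last <= len(sequence)):
        raise ValueError(f"range {first}-{last} is outside a sequence of length {len(sequence)}")
    return sequence[first - 1:last]


def tethered_construct(ns4a, ns3, ns4a_range=(21, 32), ns3_range=(3, 181),
                       linker="GSGS", tag=""):
    """Assemble tag + NS4A segment + linker + NS3 segment, amino to carboxyl terminus."""
    return tag + residues(ns4a, *ns4a_range) + linker + residues(ns3, *ns3_range)


if __name__ == "__main__":
    # Placeholder sequences of the lengths given in the text (54 and 631 residues).
    ns4a_placeholder = "X" * 54
    ns3_placeholder = "Z" * 631
    construct = tethered_construct(ns4a_placeholder, ns3_placeholder)
    print(len(construct))  # 12 NS4A residues + 4 linker residues + 179 NS3 residues
```

Passing `ns3_range=(3, 631)` gives the corresponding full-length arrangement, and a different `linker` string models the alternative linkers mentioned above.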

Routine procedures in the art would allow one to construct
covalent NS4A-NS3 complexes of the invention having linkers of
various sizes. It will be understood by one skilled in the art, for example,
that if smaller or larger portions of the NS3 or NS4A domains are used
to construct the covalent complexes of the invention, longer or shorter
amino acid linkers can be used.

Other embodiments of the present invention contain smaller or
larger portions of the NS4A cofactor peptide. In preferred embodiments,
the complexes contain an NS4A hydrophobic domain comprising at
least amino acid residues 22-30 of the full length NS4A cofactor peptide.
More preferably, the complexes contain from 12-19 amino acid residues
spanning the central hydrophobic domain of the full length NS4A
peptide. Most preferably, the complexes contain amino acid residues
21-32 of full length NS4A peptide.

Still further embodiments of the present invention contain
smaller or larger portions of the NS3 protease. In preferred
embodiments, the complexes contain an NS3 serine protease domain
comprising at least amino acid residues 3-181 of the full length NS3
protease. More preferably, the complexes contain amino acid residues
1-181 of full length NS3 protease. Most preferably, the complexes contain
amino acid residues 3-181 of full length NS3 protease.

The present invention thus also includes covalent NS4A-NS3
complexes comprising the central hydrophobic domain of the NS4A
peptide tethered to the amino terminus of full-length mature NS3
protease (amino acids 1-631) by an amino acid linker. The amino acid
sequences of preferred embodiments comprising NS4A tethered to
full-length mature NS3 protease are set forth in SEQ ID NOs: 11-18.

Surprisingly, it has also been found that the introduction of point
mutations at position 17 and/or 18 of the NS3 domain of the NS4A-NS3
constructs of the present invention which change a hydrophobic amino
acid residue to a hydrophilic amino acid residue produces a more soluble
and mono-dispersed form of the tethered complex. Thirteen
representative embodiments of such mutant NS4A-NS3 complexes are
disclosed in the Examples below. In some embodiments, the isoleucine
at position 17 is mutated to lysine. One such mutant form is referred to
as His-NS4A₂₁₋₃₂-GSGS-NS3₃₋₁₈₁/I17K (SEQ ID NO: 2). In other
embodiments, the same mutation is made at position 18. One such
mutant form is referred to as His-NS4A₂₁₋₃₂-GSGS-NS3₃₋₁₈₁/I18K (SEQ
ID NO: 3). In yet other embodiments, the mutations are introduced at
both positions. One such mutant is referred to as
His-NS4A₂₁₋₃₂-GSGS-NS3₃₋₁₈₁/I17K,I18K (SEQ ID NO: 4). Each of the
purified mutants results in a monodispersed (as judged by size exclusion
chromatography) and more soluble (as judged by the higher attainable
concentration of the complex, 17-20 mg/ml) form of the complex, which
remains monodispersed for a period of about one week at 4°C, while still
exhibiting kinetic properties identical to those of the wild type.

It will be understood that although the foregoing embodiments
are presently preferred, other modifications to the hydrophobic residues
at positions 17 and 18 can be made to produce other soluble complexes.
Preferably, neutral amino acid residues will be substituted for charged
residues. These modifications can be used in a number of combinations
to produce the final modified protein chain.

Also provided are NS4A-tethered forms of the full-length NS3
domain. In contrast to the NS4A-tethered forms of the catalytic domain,
a considerable amount of autocleavage in the helicase domain of the
NS3 protein is detected during the purification of their native
full-length counterpart, HIS-NS4A₂₁₋₃₂-NS3₃₋₆₃₁. To prevent autocleavage
of the full-length covalent complexes, the catalytic serine residue at
position 139 is mutated to alanine. The amino acid sequence of one such
embodiment is defined by SEQ ID NO: 15. The mutation of the full
length constructs at position 139 can also be made in the NS4A-tethered
forms of the NS3 catalytic domain, and can be made in combination
with any of the aforementioned mutations to increase solubility and
stability while preventing autocleavage. Representative embodiments
are set forth in SEQ ID NOs: 5-8, 15-18 and 20.
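
The mutant names used above (I17K, I18K, S139A) give positions in NS3 numbering, whereas the NS3 segment of the tethered constructs begins at residue 3. The sketch below shows one way to apply such a mutation while checking the wild-type residue. The function name `apply_mutation` and the placeholder sequence are hypothetical; only the residue identities at positions 17, 18 and 139 are taken from the text.

```python
import re


def apply_mutation(domain_seq, mutation, first_residue=3):
    """Apply a point mutation such as 'I17K' to an NS3 segment.

    domain_seq    -- the NS3 segment only (without tag, NS4A segment or linker)
    mutation      -- wild-type letter, NS3 residue number, replacement letter
    first_residue -- NS3 residue number of domain_seq[0] (3 for the 3-181 segment)
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - first_residue
    if not 0 <= index < len(domain_seq):
        raise ValueError(f"position {position} is outside the segment")
    if domain_seq[index] != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {domain_seq[index]}")
    return domain_seq[:index] + replacement + domain_seq[index + 1:]


def placeholder_ns3_segment(first=3, last=181):
    """Placeholder segment: glycine everywhere except the residues named in the text."""
    segment = ["G"] * (last - first + 1)
    for position, residue in ((17, "I"), (18, "I"), (139, "S")):
        segment[position - first] = residue
    return "".join(segment)


if __name__ == "__main__":
    mutant = placeholder_ns3_segment()
    for name in ("I17K", "I18K", "S139A"):
        mutant = apply_mutation(mutant, name)
    print(mutant[14:16], mutant[136])
```

Checking the wild-type letter before substitution guards against off-by-two errors that arise when construct numbering and NS3 numbering are confused.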

As used herein, the terms "native NS3" and "full-length NS3" are
used interchangeably and are defined as a protein which (a) has an
amino acid sequence substantially identical to the sequence defined by
SEQ ID NO: 23 and (b) has biological activity that is common to native
NS3. This includes natural allelic variants and other variants having
one or more conservative amino acid substitutions [Grantham, 1974,
Science 185:862] that do not substantially impair biological activity. Such
conservative substitutions involve groups of synonymous amino acids,
e.g., as described in U.S. Patent No. 5,017,691 to Lee et al.

The "serine protease domain" of NS3 or the "catalytic domain" of
NS3 refers to amino acids 1-181 of mature NS3, which have been shown
to contain the active catalytic triad His, Asp and Ser.

The term "native NS4A peptide" as used herein is defined as a
peptide which (a) has an amino acid sequence substantially identical to
the sequence defined by SEQ ID NO: 24; and (b) has biological activity
that is common to native NS4A. This includes natural allelic variants
and other variants having one or more conservative amino acid
substitutions [Grantham, 1974, Science 185:862] that do not substantially
impair biological activity. Such conservative substitutions involve
groups of synonymous amino acids, e.g., as described in U.S. Patent No.
5,017,691 to Lee et al.

As used herein, the "central hydrophobic domain of NS4A
peptide" refers to that portion of the native NS4A peptide
(approximately amino acid residues 22-30) which is sufficient for
activation of NS3 protease. Size and sequence variants of this domain
which also activate the NS3 protease in the claimed complexes also fall
within this term.

A "soluble" covalent complex as referred to herein is defined as a
protein which will remain in solution after a high spin centrifugation
step at 300,000 x g in a standard ultracentrifuge in a buffer containing 25
mM HEPES, pH 7.6, 10% glycerol, 0.3 M NaCl, 10 mM βME.






WO 99/28482 PCT/US98/24528 


An "active" covalent complex as referred to herein is defined as a
complex which will cleave synthetic substrates corresponding to the
NS5A-NS5B cleavage site (for example, DTEDWCC SMYTWTGK (SEQ ID
NO: 25)) between the P1 residue, cysteine, and the P1' residue, serine,
in a buffer containing 25 mM Tris, pH 7.5, 150 mM NaCl, 10% glycerol,
and 0.05% lauryl maltoside.

Nucleic acids encoding the covalent NS4A-NS3 complexes are
also a part of this invention. DNA encoding the covalent NS4A-NS3
complexes of this invention can be prepared by chemical synthesis
using the known nucleic acid sequence [Ratner et al., 1985, Nucleic
Acids Res. 13:5007] and standard methods such as the
phosphoramidite solid support method of Matteucci et al., 1981, J.
Am. Chem. Soc. 103:3185 or the method of Yoo et al., 1989, J. Biol.
Chem. 264:17078. See also Glick, Bernard R. and Pasternak, Molecular
Biotechnology, pages 55-63 (ASM Press, Washington, D.C. 1994).
The genes encoding the desired regions of the HCV protein can also
be obtained using the plasmid disclosed in Grakoui et al., 1993, J.
Virol. 67:1385-1395 or that disclosed in Takamizawa et al., 1991, J.
Virol. 65(3):1105-1113. Also, the nucleic acid encoding HCV NS3
and NS4A can be isolated, amplified and cloned from patients
infected with HCV. Furthermore, the HCV genome has
been disclosed in PCT WO 89/04669 and is available from the
American Type Culture Collection (ATCC), 12301 Parklawn Drive,
Rockville, MD under ATCC accession no. 40394.

Of course, because of the degeneracy of the genetic code, there
are many functionally equivalent nucleic acid sequences that can
encode the NS3 and NS4A domains of the covalent NS4A-NS3
complexes as defined herein. Such functionally equivalent
sequences, which can readily be prepared using known methods such
as chemical synthesis, PCR employing modified primers and
site-directed mutagenesis, are within the scope of this invention.
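
The scale of this degeneracy can be illustrated by counting the distinct coding sequences for a short peptide. The sketch assumes the standard genetic code. The table and the function name `count_encodings` are illustrative and not part of the disclosure.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (sums to 61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide):
    """Number of distinct nucleotide sequences encoding the peptide (no stop codon)."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


if __name__ == "__main__":
    # The four-residue GSGS linker: 4 * 6 * 4 * 6 possible coding sequences.
    print(count_encodings("GSGS"))
```

Even the four-residue GSGS linker has 576 functionally equivalent coding sequences under this assumption, and the count grows multiplicatively with length.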

Various vectors can be used to express DNA encoding the
covalent NS4A-NS3 complexes. Conventional vectors used for
expression of recombinant proteins in prokaryotic or eukaryotic cells
may be used. Preferred vectors include the pET vectors described by
Studier et al., 1990, Methods in Enzymology 185:60-89, and the pcD
vectors described by Okayama et al., 1983, Mol. Cell. Biol. 3:280-289; and
Takebe et al., 1988, Mol. Cell. Biol. 8:466-472. Other SV40-based
mammalian expression vectors include those disclosed in Kaufman et
al., 1982, Mol. Cell. Biol. 2:1304-1319 and U.S. Patent No. 4,675,285. These
SV40-based vectors are particularly useful in COS7 monkey cells (ATCC
No. CRL 1651), as well as in other mammalian cells such as mouse L
cells and CHO cells.

Standard transfection methods can be used to produce eukaryotic
cell lines which express large quantities of polypeptides. Eukaryotic cell
lines include mammalian, yeast and insect cell lines. Exemplary
mammalian cell lines include COS-7 cells, mouse L cells and Chinese
Hamster Ovary (CHO) cells. See Sambrook et al., supra and Ausubel et
al., supra.

As used herein, the term "transformed bacteria" means bacteria
that have been genetically engineered to produce a viral or mammalian
protein. Such genetic engineering usually entails the introduction of an
expression vector into a bacterium. The expression vector is capable of
autonomous replication and protein expression relative to genes in the
bacterial genome. Construction of bacterial expression vectors is well
known in the art, provided the nucleotide sequence encoding a desired
protein is known or otherwise ascertainable. For example, DeBoer in
U.S. Pat. No. 4,551,433 discloses promoters for use in bacterial expression
vectors; Goeddel et al. in U.S. Pat. No. 4,601,980 and Riggs in U.S. Pat.
No. 4,431,739 disclose the production of mammalian proteins by E. coli
expression systems; and Riggs, supra, Ferretti et al., 1986, Proc. Natl. Acad.
Sci. 83:599, Sproat et al., 1985, Nucleic Acids Research 13:2959 and
Mullenbach et al., 1986, J. Biol. Chem. 261:719 disclose how to construct
synthetic genes for expression in bacteria. Many bacterial expression
vectors are available commercially and through the American Type
Culture Collection (ATCC), Rockville, Maryland.

Insertion of DNA encoding the covalent NS4A-NS3 complexes
into a vector is easily accomplished when the termini of both the
DNA and the vector comprise the same restriction site. If this is not
the case, it may be necessary to modify the termini of the DNA
and/or vector by digesting back single-stranded DNA overhangs
generated by restriction endonuclease cleavage to produce blunt ends,
or to achieve the same result by filling in the single-stranded termini
with an appropriate DNA polymerase.

Alternatively, any site desired may be produced by ligating
nucleotide sequences (linkers) onto the termini. Such linkers may
comprise specific oligonucleotide sequences that define desired
restriction sites. The cleaved vector and the DNA fragments may
also be modified if required by homopolymeric tailing.

Many E. coli-compatible expression vectors can be used to
produce soluble covalent NS4A-NS3 complexes of the present
invention, including but not limited to vectors containing bacterial
or bacteriophage promoters such as the Tac, Lac, Trp, LacUV5, λ PR
and λ PL promoters. Preferably, a vector selected will have expression
control sequences that permit regulation of the rate of expression.
Then, production of covalent NS4A-NS3 complexes can be regulated
to avoid overproduction that could prove toxic to the host cells.

Most preferred is a vector comprising, from 5' to 3' (upstream to
downstream), a Tac promoter, a lac Iq repressor gene and DNA
encoding mature human HCV protease. The vectors chosen for use
in this invention may also encode secretory leaders such as the
ompA or protein A leader, as long as such leaders are cleaved during
post-translational processing to produce covalent NS4A-NS3
complexes or, if the leaders are not cleaved, the leaders do not
interfere with the enzymatic activity of the protease.

The covalent complexes of the invention, or portions thereof,
can also be synthesized by a suitable method such as by exclusive solid
phase synthesis, partial solid phase methods, fragment condensation or
classical solution synthesis. The polypeptides are preferably prepared by
solid phase peptide synthesis as described by Merrifield, 1963, J. Am.
Chem. Soc. 85:2149. The synthesis is carried out with amino acids that
are protected at the alpha-amino terminus. Trifunctional amino acids
with labile side-chains are also protected with suitable groups to prevent
undesired chemical reactions from occurring during the assembly of the
polypeptides. The alpha-amino protecting group is selectively removed
to allow subsequent reaction to take place at the amino-terminus. The
conditions for the removal of the alpha-amino protecting group do not
remove the side-chain protecting groups.

The alpha-amino protecting groups are those known to be
useful in the art of stepwise polypeptide synthesis. Included are acyl
type protecting groups (e.g., formyl, trifluoroacetyl, acetyl), aryl type
protecting groups (e.g., biotinyl), aromatic urethane type protecting
groups [e.g., benzyloxycarbonyl (Cbz), substituted benzyloxycarbonyl
and 9-fluorenylmethyloxycarbonyl (Fmoc)], aliphatic urethane
protecting groups [e.g., t-butyloxycarbonyl (tBoc),
isopropyloxycarbonyl, cyclohexyloxycarbonyl] and alkyl type
protecting groups (e.g., benzyl, triphenylmethyl). The preferred
protecting groups are tBoc and Fmoc; thus the peptides are said to be
synthesized by tBoc and Fmoc chemistry, respectively.

The side-chain protecting groups selected must remain intact
during coupling and not be removed during the deprotection of the
amino-terminus protecting group or during coupling conditions.
The side-chain protecting groups must also be removable upon the
completion of synthesis, using reaction conditions that will not alter
the finished polypeptide. In tBoc chemistry, the side-chain protecting
groups for trifunctional amino acids are mostly benzyl based. In
Fmoc chemistry, they are mostly tert-butyl or trityl based.

In tBoc chemistry, the preferred side-chain protecting groups
are tosyl for Arg, cyclohexyl for Asp, 4-methylbenzyl (and
acetamidomethyl) for Cys, benzyl for Glu, Ser and Thr,
benzyloxymethyl (and dinitrophenyl) for His, 2-Cl-benzyloxycarbonyl
for Lys, formyl for Trp and 2-bromobenzyl for Tyr. In Fmoc
chemistry, the preferred side-chain protecting groups are
2,2,5,7,8-pentamethylchroman-6-sulfonyl (Pmc) or
2,2,4,6,7-pentamethyldihydrobenzofuran-5-sulfonyl (Pbf) for Arg, trityl for
Asn, Cys, Gln and His, tert-butyl for Asp, Glu, Ser, Thr and Tyr, tBoc
for Lys and Trp.
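
The preferred side-chain protecting groups listed above lend themselves to a lookup table. The following sketch transcribes them by one-letter residue code. The names `FMOC`, `TBOC`, `SCHEMES` and `side_chain_protection` are hypothetical, and residues absent from the list are simply treated as needing no side-chain protection.

```python
# Preferred side-chain protecting groups, transcribed from the text above.
FMOC = {
    "R": "Pmc or Pbf",
    "N": "trityl", "C": "trityl", "Q": "trityl", "H": "trityl",
    "D": "tert-butyl", "E": "tert-butyl", "S": "tert-butyl",
    "T": "tert-butyl", "Y": "tert-butyl",
    "K": "tBoc", "W": "tBoc",
}
TBOC = {
    "R": "tosyl",
    "D": "cyclohexyl",
    "C": "4-methylbenzyl (and acetamidomethyl)",
    "E": "benzyl", "S": "benzyl", "T": "benzyl",
    "H": "benzyloxymethyl (and dinitrophenyl)",
    "K": "2-Cl-benzyloxycarbonyl",
    "W": "formyl",
    "Y": "2-bromobenzyl",
}
SCHEMES = {"Fmoc": FMOC, "tBoc": TBOC}


def side_chain_protection(peptide, chemistry="Fmoc"):
    """List (position, residue, group) for residues with a listed protecting group.

    Positions are 1-based from the amino terminus.
    """
    table = SCHEMES[chemistry]
    return [(i, aa, table[aa]) for i, aa in enumerate(peptide, start=1) if aa in table]


if __name__ == "__main__":
    for entry in side_chain_protection("GSGS", "Fmoc"):
        print(entry)
```

For the GSGS linker, only the two serine residues carry a side-chain group: tert-butyl in Fmoc chemistry and benzyl in tBoc chemistry.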

For the synthesis of phosphopeptides, either direct or
post-assembly incorporation of the phosphate group is used. In the direct
incorporation strategy, the phosphate group on Ser, Thr or Tyr may
be protected by methyl, benzyl or tert-butyl in Fmoc chemistry or by
methyl, benzyl or phenyl in tBoc chemistry. Direct incorporation of
phosphotyrosine without phosphate protection can also be used in
Fmoc chemistry. In the post-assembly incorporation strategy, the
unprotected hydroxyl group of Ser, Thr or Tyr is derivatized on solid
phase with di-tert-butyl-, dibenzyl- or
dimethyl-N,N'-diisopropylphosphoramidite and then oxidized by
tert-butyl hydroperoxide.

Solid phase synthesis is usually carried out from the
carboxyl-terminus by coupling the alpha-amino protected (side-chain
protected) amino acid to a suitable solid support. An ester linkage is
formed when the attachment is made to a chloromethyl, chlorotrityl
or hydroxymethyl resin, and the resulting polypeptide will have a
free carboxyl group at the C-terminus. Alternatively, when an amide
resin such as benzhydrylamine or p-methylbenzhydrylamine resin
(for tBoc chemistry) and Rink amide or PAL resin (for Fmoc
chemistry) is used, an amide bond is formed and the resulting
polypeptide will have a carboxamide group at the C-terminus. These
resins, whether polystyrene- or polyamide-based or
polyethyleneglycol-grafted, with or without a handle or linker, with
or without the first amino acid attached, are commercially available,
and their preparations have been described by Stewart et al. (1984),
"Solid Phase Peptide Synthesis" (2nd Edition), Pierce Chemical Co.,
Rockford, IL; Bayer & Rapp (1986) Chem. Pept. Prot. 3, 3; and
Atherton et al. (1989) Solid Phase Peptide Synthesis: A Practical
Approach, IRL Press, Oxford.

The C-terminal amino acid, protected at the side-chain if
necessary and at the alpha-amino group, is attached to a
hydroxymethyl resin using various activating agents including
dicyclohexylcarbodiimide (DCC), N,N'-diisopropylcarbodiimide
(DIPCDI) and carbonyldiimidazole (CDI). It can be attached to
chloromethyl or chlorotrityl resin directly in its cesium
tetramethylammonium salt form or in the presence of triethylamine
(TEA) or diisopropylethylamine (DIEA). First amino acid
attachment to an amide resin is the same as amide bond formation
during coupling reactions.
Following the attachment to the resin support, the alpha-amino
protecting group is removed using various reagents
depending on the protecting chemistry (e.g., tBoc, Fmoc). The extent
of Fmoc removal can be monitored at 300-320 nm or by a
conductivity cell. After removal of the alpha-amino protecting
group, the remaining protected amino acids are coupled stepwise in
the required order to obtain the desired sequence.

Various activating agents can be used for the coupling
reactions including DCC, DIPCDI, 2-chloro-1,3-dimethylimidium
hexafluorophosphate (CIP),
benzotriazol-1-yl-oxy-tris-(dimethylamino)-phosphonium
hexafluorophosphate (BOP) and its pyrrolidine analog (PyBOP),
bromo-tris-pyrrolidino-phosphonium hexafluorophosphate (PyBroP),
O-(benzotriazol-1-yl)-1,1,3,3-tetramethyluronium
hexafluorophosphate (HBTU) and its tetrafluoroborate analog (TBTU)
or its pyrrolidine analog (HBPyU),
O-(7-azabenzotriazol-1-yl)-1,1,3,3-tetramethyluronium
hexafluorophosphate (HATU) and its tetrafluoroborate analog
(TATU) or pyrrolidine analog (HAPyU). The most common catalytic
additives used in coupling reactions include
4-dimethylaminopyridine (DMAP),
3-hydroxy-3,4-dihydro-4-oxo-1,2,3-benzotriazine (HODhbt),
N-hydroxybenzotriazole (HOBt) and 1-hydroxy-7-azabenzotriazole
(HOAt). Each protected amino acid is used in excess (>2.0
equivalents), and the couplings are usually carried out in
N-methylpyrrolidone (NMP) or in DMF, CH2Cl2 or mixtures thereof.
The extent of completion of the coupling reaction can be monitored
at each stage, e.g., by the ninhydrin reaction as described by
Kaiser et al., Anal. Biochem. 34:595 (1970). In cases where
incomplete coupling is found, the coupling reaction is extended and
repeated and may have chaotropic salts added. The coupling
reactions can be performed automatically with commercially
available instruments such as ABI model 430A, 431A and 433A
peptide synthesizers.
After the entire assembly of the desired polypeptide, the
polypeptide-resin is cleaved with a reagent with proper scavengers.
The Fmoc peptides are usually cleaved and deprotected by TFA with
scavengers (e.g., H2O, ethanedithiol, phenol and thioanisole). The
tBoc peptides are usually cleaved and deprotected with liquid HF for
1-2 hours at -5 to 0°C, which cleaves the polypeptide from the resin
and removes most of the side-chain protecting groups. Scavengers
such as anisole, dimethylsulfide and p-thiocresol are usually used
with the liquid HF to prevent cations formed during the cleavage
from alkylating and acylating the amino acid residues present in the
polypeptide. The formyl group of Trp and dinitrophenyl group of His
need to be removed, respectively, by piperidine and thiophenol in
DMF prior to the HF cleavage. The acetamidomethyl group of Cys can
be removed by mercury(II) acetate and alternatively by iodine,
thallium(III) trifluoroacetate or silver tetrafluoroborate, which
simultaneously oxidize cysteine to cystine. Other strong acids used
for tBoc peptide cleavage and deprotection include
trifluoromethanesulfonic acid (TFMSA) and
trimethylsilyltrifluoroacetate (TMSOTf).

Recombinant DNA methodology can also be used to prepare
the polypeptides. The known genetic code, tailored if desired with
known preferred codons for more efficient expression in a given host
organism, can be used to synthesize oligonucleotides encoding the
desired amino acid sequences. The phosphoramidite solid support
method of Matteucci et al., J. Am. Chem. Soc. 103:3185 (1981) or other
known methods can be used for such syntheses. The resulting
oligonucleotides can be inserted into an appropriate vector and
expressed in a compatible host organism.

The polypeptides of the invention can be purified using HPLC, gel
filtration, ion exchange and partition chromatography, countercurrent
distribution or other well known methods. In a preferred embodiment
of the present invention the covalent NS4A-NS3 complexes also contain
a histidine tag which facilitates purification using a Ni2+ column as is
illustrated below.

One can use the covalent NS4A-NS3 complexes of the invention,
along with known synthetic substrates, to develop high throughput
assays. These can be used to screen for compounds which inhibit
proteolytic activity of the protease. This is carried out by developing
techniques for determining whether or not a compound will inhibit the
covalent NS4A-NS3 complexes of the invention from cleaving the viral
substrates. Examples of such synthetic substrates are set forth in SEQ ID
NOs 25 and 93. If the substrates are not cleaved, the virus cannot
replicate. One example of such a high throughput assay is the
scintillation proximity assay (SPA). SPA technology involves the use of
beads coated with scintillant. Bound to the beads are acceptor molecules
such as antibodies, receptors or enzyme substrates which interact with
ligands or enzymes in a reversible manner.

For a typical protease assay the substrate peptide is biotinylated at
one end and the other end is radiolabeled with low energy emitters
such as 125I or 3H. The labeled substrate is then incubated with the
enzyme. Avidin coated SPA beads are then added which bind to the
biotin. When the substrate peptide is cleaved by the protease, the
radioactive emitter is no longer in proximity to the scintillant bead and
no light emission takes place. Inhibitors of the protease will leave the
substrate intact and can be identified by the resulting light emission
which takes place in their presence.
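
The readout logic described above can be sketched numerically. The following is a minimal illustration only, assuming two control wells that the text does not specify: one with enzyme and no inhibitor (substrate cleaved, low signal) and one without enzyme (substrate intact, high signal). The function name and the counts are hypothetical.

```python
def percent_inhibition(sample_cpm, cleaved_cpm, intact_cpm):
    """Scale a well's SPA signal between two assumed control wells.

    cleaved_cpm: enzyme, no inhibitor (substrate cut, little light).
    intact_cpm:  no enzyme (substrate intact, full light emission).
    """
    window = intact_cpm - cleaved_cpm
    if window <= 0:
        raise ValueError("intact control must read higher than cleaved control")
    return 100.0 * (sample_cpm - cleaved_cpm) / window


# Hypothetical counts, for illustration only.
print(percent_inhibition(sample_cpm=6000, cleaved_cpm=1000, intact_cpm=11000))
```

Because light emission rises when the substrate stays intact, a higher signal in this scheme corresponds to stronger inhibition.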

Another type of protease assay utilizes the phenomenon of
surface plasmon resonance (SPR). A novel, high throughput enzymatic
assay utilizing surface plasmon resonance technology has been
successfully developed. Using this assay, and a dedicated BIAcore™
instrument, at least 1000 samples per week can be screened for either
their enzymatic activity or their inhibitory effects toward the enzymatic
activity, in a 96 well plate format. This methodology is readily adaptable
to any enzyme-substrate reaction. The advantage of this assay over the
SPA assay is that it does not require a radiolabeled peptide substrate.

EXAMPLES 

Several covalent NS4A-NS3 complexes have been constructed,
purified, characterized and assayed for activity based on a cDNA clone
containing an HCV Japanese (1b/BK) strain whose sequence is published
in Takamizawa et al., 1991, J. Virology 65:1105-1113. DNA sequencing of
the clone (BK 138-1) revealed four amino acid differences with the
published sequence, at positions 66 (A->G), 86 (P->Q), 87 (K->A) and 147
(F->S) of the NS3 protein.

The present invention can be illustrated by the following
non-limiting examples.

Reagents and General Methods

Plasmid pHCV-1b/BK can be derived from DNA fragments
containing the entire DNA sequence of HCV BK cDNA as reported by
Takamizawa et al., 1991, J. Virology 65:1105-1113, with the
above-mentioned changes. Plasmid pMD-34-2 is derived from that portion of
the disclosed DNA sequence which encodes NS3 residues 1-631 from
HCV BK cDNA.

Restriction enzymes, Vent polymerase and ThermoPol buffer
were obtained from New England Biolabs (Beverly, MA). The
QuikChange mutagenesis kit and dNTPs were obtained from
Stratagene (La Jolla, CA). Ready-to-Go T4 DNA Ligase was obtained from
Pharmacia Biotech (Piscataway, NJ). Oligonucleotide primers were
synthesized by Genosys Biotechnologies (The Woodlands, Texas). DNA
sequencing was performed according to the Sanger dideoxy method by
Bioserve Biotechnologies (Laurel, MD). pET vectors and BL21(DE3) cells
were obtained from Novagen (Madison, WI). PCR reactions were
carried out in a Perkin Elmer Cetus model 480 DNA thermocycler.

DH5α cells and TAE buffer were purchased from Gibco BRL. GTG
agarose was purchased from FMC Corporation. The QIAquick gel
extraction kit and QIAquick PCR purification kit were purchased from
Qiagen Inc. (Chatsworth, CA).

Standard recombinant DNA methods were carried out
essentially as described by Sambrook et al. in "Molecular Cloning: A
Laboratory Manual," 2nd edition, 1989, Cold Spring Harbor Press,
Plainview, New York.

Preparation of NS4A-Tethered Forms of HCV NS3 Protease

Native, NS4A-tethered forms of NS3 catalytic domain 

Various NS4A-tethered forms of the NS3 catalytic domain were
constructed by joining the NS4A peptide GSVVIVGRIILS (NS4A amino
acids 21-32) to the amino terminus of NS3 amino acids 3-181 via
various three or four residue linkers, and were cloned into the pET-28b+
vector.

Single stranded oligonucleotide primers were designed to
generate a 616 base pair PCR fragment containing an NdeI site followed
by the NS4A peptide, a linker, and amino acids 3-181 of the NS3 catalytic
domain at the 5' terminus and a stop codon flanked by an EcoRI site at
the 3' terminus. The template used was the sequence disclosed in
Takamizawa et al., 1991, J. Virology 65(3):1105-1113, which contains the
entire HCV genome from the 1b/BK strain, except for the four
differences described above. Other sources for HCV DNA can be used in
the disclosed methods, including plasmid pBRTM/HCV 1-3011 (Grakoui
et al., 1993), which contains the entire genome from the 1a strain.

Vent DNA polymerase was utilized to amplify the DNA by PCR.
Primers were diluted in dH2O to give a final concentration of 50 µg/ml.
The template was diluted in dH2O to give a final concentration of 10
ng/µl. The dNTPs (GTP, ATP, CTP, TTP) were diluted to a
concentration of 10 mM (2.5 mM each) in dH2O.

100 µl reactions were prepared for PCR in a 500 µl Eppendorf tube
by addition of the following reagents: 74 µl of dH2O, 10 µl of the 10x
ThermoPol buffer (final 1x buffer: 10 mM KCl, 20 mM Tris-HCl (pH 8.8),
2 mM MgSO4 and 0.1% Triton X-100), 10 µl of template (100 ng), 2 µl of
the 5' primer (100 ng), 1 µl of the 3' primer (50 ng), 2 µl of the dNTP
mixture (200 µM) and 1 µl of Vent polymerase enzyme (1 unit). The
mixture was then overlaid with 20 µl of immersion oil and placed in the
thermocycler for amplification. The PCR conditions were as follows:
95 °C for 45 seconds (1 cycle); 95 °C for 30 seconds, 55 °C for 1
minute, 72 °C for 2 minutes (25 cycles).
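
The stated volumes and stock concentrations can be cross-checked with a few lines of arithmetic. This is a sketch for bookkeeping only; the helper names are hypothetical, and the stock values are the ones given in the text (primers at 50 µg/ml, i.e. 50 ng/µl; template at 10 ng/µl; dNTP mixture at 10 mM total).

```python
# Component volumes in microliters, as listed in the text.
VOLUMES_UL = {
    "dH2O": 74,
    "10x ThermoPol buffer": 10,
    "template": 10,
    "5' primer": 2,
    "3' primer": 1,
    "dNTP mixture": 2,
    "Vent polymerase": 1,
}

total_ul = sum(VOLUMES_UL.values())


def mass_ng(volume_ul, stock_ng_per_ul):
    """Mass delivered by pipetting volume_ul of a stock."""
    return volume_ul * stock_ng_per_ul


def final_conc(stock_conc, volume_ul, total_volume_ul):
    """Concentration after dilution into the full reaction volume."""
    return stock_conc * volume_ul / total_volume_ul


template_ng = mass_ng(VOLUMES_UL["template"], 10)        # 10 ng/ul stock
primer5_ng = mass_ng(VOLUMES_UL["5' primer"], 50)        # 50 ng/ul stock
primer3_ng = mass_ng(VOLUMES_UL["3' primer"], 50)
dntp_total_um = final_conc(10_000, VOLUMES_UL["dNTP mixture"], total_ul)  # uM
buffer_x = final_conc(10, VOLUMES_UL["10x ThermoPol buffer"], total_ul)

print(total_ul, template_ng, primer5_ng, primer3_ng, dntp_total_um, buffer_x)
```

Under these inputs the components sum to the stated 100 µl, and the 200 µM figure corresponds to total dNTP (50 µM of each nucleotide).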

The amplified 616 base pair fragment was purified in preparation
for restriction digestion using a QIAquick PCR purification kit according
to the manufacturer's protocol without modification. Briefly, the
aqueous layer was removed and placed in a 1.5 ml Eppendorf tube with a
reagent that aids the DNA to bind to a column matrix. The DNA was
washed while bound to the column and then eluted with 43 µl of H2O.
The DNA was then double digested with EcoRI and NdeI in a 50 µl
volume for 1 hour at 37 °C. The reaction took place in a 1.5 ml
polypropylene Eppendorf tube with 5 µl of 10x EcoRI buffer (final
concentration of 50 mM NaCl, 100 mM Tris-HCl, 10 mM MgCl2, 0.25%
Triton X-100, pH 7.5) and 1 µl of EcoRI and NdeI (20 units). The pET-28b+
vector (3 µg) was also digested using the same conditions. The digests
were further purified by resolving them on a 1.0% agarose
electrophoresis gel for 45 minutes under 100 volts. They were rendered
visible with 0.5 µg/ml of ethidium bromide, excised with a scalpel under
short-wave UV, solubilized and purified using the QIAquick gel
extraction kit according to manufacturer's protocol without
modifications. The fragments were quantitated by visually comparing a
5 µl aliquot of the purified fragment versus Lambda HindIII DNA
standards on a 1% agarose gel. Approximately 200 ng of vector and 50 ng
of PCR fragment were ligated together in a 20 µl volume for 18 hours at
16 °C. They were combined together in a T4 ligase (Ready-to-Go)
reaction tube according to standard protocol without modifications.

2 µl of this mixture was then used to transform 50 µl of DH5α
cells for plasmid propagation according to manufacturer's protocol.
Briefly, a 1.5 ml Eppendorf tube was placed on ice and 50 µl of DH5α cells
(previously stored at -80°C and then thawed on ice immediately prior to
use) were added to the tube along with the 2 µl of ligation mixture and
allowed to incubate for 30 minutes. They were then heat shocked for 1
minute at 42°C, returned to the ice for 2 minutes and then regenerated
with 500 µl of SOC medium and incubated at 37°C for 1 hour at 300 rpm.

200 µl of these cells were then plated out on LB/20-10-5 agar (per
liter: tryptone 50 grams, yeast extract 25 grams, NaCl 12.5 grams) with
kanamycin (25 µg/ml), spread for single colony isolation and incubated
at 37 °C overnight. Three single colonies were selected for plasmid
preparations. They were inoculated into 100 ml of LB/20-10-5 broth
with kanamycin (25 µg/ml) in a 250 ml baffled flask and grown
overnight for 18 hours at 37 °C at 300 RPM in a shaker. The next
day, the cultures were spun down in 500 ml Nalgene centrifuge bottles
(8000 RPM, 10 minutes, 4 °C) and the pellet was harvested for plasmid
isolation. The Qiagen midi-prep kit was used according to
manufacturer's protocol. The DNA was quantitated using a UV/VIS
spectrophotometer (Perkin-Elmer) at 260 nm. The purified plasmid-DNA
isolates were sequenced on an Applied Biosystems 373A DNA
sequencer at Bioserve Biotechnologies, Inc. To confirm the sequence,
both top and bottom strands were sequenced via primers that were
synthesized by Bioserve Biotechnologies.
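
The text does not give the conversion used for the 260 nm reading. A minimal sketch, assuming the common convention that an absorbance of 1.0 at 260 nm in a 1 cm path corresponds to roughly 50 µg/ml of double-stranded DNA, is shown below; the function name, the reading and the dilution factor are hypothetical.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # assumed conversion factor for dsDNA, 1 cm path


def dsdna_ug_per_ml(a260, dilution_factor=1.0):
    """Estimate the dsDNA concentration of the undiluted sample."""
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be >= 0 and dilution factor > 0")
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


# Hypothetical reading: A260 of 0.25 measured on a 1:20 dilution.
print(dsdna_ug_per_ml(0.25, dilution_factor=20))
```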

Native, NS4A-tethered forms of NS3 full-length domain 

Both parental plasmids, HIS-NS4A(21-32)-GSGS-NS3(3-631) and
HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A, were created via a cut and
paste method. Briefly, 5 µl of plasmid pMD-34-2 (1 µg), plasmid
HIS-NS4A(21-32)-GSGS-NS3(3-181) (5 µg) and plasmid
HIS-NS3(1-631)/S139A (1 µg) were each digested separately in a 1.5 ml
Eppendorf tube with 5 µl of NEB buffer #2 (at final concentration of
10 mM Tris-HCl, 10 mM MgCl2, 50 mM NaCl, 1 mM DTT, pH 7.9), 0.5 µl of
acetylated BSA (final concentration 100 µg/ml), 1 µl of XbaI (2 Units)
and 38.5 µl of ddH2O.

WO 99/28482, PCT/US98/24528

These digests were incubated at 37 °C for one hour, at which time
2.5 µl of 2M NaCl (final concentration of 150 mM), 45 µl of ddH2O and 2.5
µl of BspMI (2 Units) were added to the digests and incubated for 2 more
hours at 37 °C. The double digests were then resolved on 0.8% agarose
gels and the size and quantity of the fragments were determined. The
agarose gels were electrophoresed in a BioRad apparatus and the
fragments were excised using a scalpel. The excised backbone fragments
which were derived from pMD-34-2 and HIS-NS3(1-631)/S139A were each
7.1 kb and the insert from HIS-NS4A(21-32)-GSGS-NS3(3-181) was 225 base
pairs. Approximately 2 µl of 7.1 kb backbone (200 ng) and 1 µl of 225 bp
insert (50 ng) were ligated together in a 20 µl volume for 18 hours at 16
°C. They were combined together in a T4 ligase (Ready-to-Go) reaction
tube according to standard protocol without modifications. 2 µl of this
mixture was then used to transform 50 µl of DH5α cells for plasmid
propagation according to manufacturer's protocol.
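
The mass amounts quoted for this ligation can be converted to an insert-to-backbone molar ratio, since the molar amount of a double-stranded fragment scales with its mass divided by its length. The sketch below uses only the figures stated above (200 ng of 7.1 kb backbone, 50 ng of 225 bp insert); the function name is hypothetical.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_bp, vector_ng, vector_bp):
    """Molar ratio of insert to vector for double-stranded fragments.

    Moles are proportional to mass / length, so the average mass per
    base pair cancels out of the ratio.
    """
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


ratio = insert_to_vector_molar_ratio(
    insert_ng=50, insert_bp=225, vector_ng=200, vector_bp=7100
)
print(round(ratio, 1))
```

On these figures the insert is in roughly eight-fold molar excess over the backbone.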

Three single colonies of each construct were selected for miniprep
plasmid isolations using a Qiagen miniprep kit. They were inoculated
into 5 ml of LB/20-10-5 broth with ampicillin (100 µg/ml) in 15 ml
tubes and grown overnight for 18 hours at 37°C at 300 RPM in a shaker.
The next day, the cultures were spun down (3000 RPM, 10 minutes, 4°C)
and the pellet was harvested for plasmid isolation. The clones were
then assessed for recombination by digesting with BspMI and XbaI
according to the conditions described above. The digests were resolved
on a 1% agarose gel and only those constructs yielding a 225 bp and a 7.1
kb fragment were chosen as positives. Cultures from the positive
clones were inoculated into 100 ml of LB/20-10-5 broth with ampicillin
(100 µg/ml) in a 250 ml baffled flask and grown overnight for 18 hours
at 37°C at 300 RPM in a shaker. The next day, the cultures were spun
down in 500 ml Nalgene centrifuge bottles (8000 RPM, 10 minutes, 4°C)
and the pellet was harvested for plasmid isolation. The Qiagen
midi-prep kit was used according to manufacturer's protocol. The DNA was
quantitated using a UV/VIS spectrophotometer (Perkin-Elmer) at 260
nm. The purified plasmid-DNA isolates were sequenced at the
restriction site junctions on an Applied Biosystems 373A DNA
sequencer at Bioserve Biotechnologies, Inc.

Site-directed Mutants. 

All site-directed mutations created in either NS4A-tethered forms
of catalytic or full-length domain of NS3 protease were carried out using
the QuikChange site-directed mutagenesis kit (Stratagene) according to
the manufacturer's protocol. For each mutation, two oligonucleotide
primers (10 picomoles each) containing the desired mutation were used
to amplify the entire plasmid encompassing the NS4A-tethered NS3
protease gene (50 or 100 ng/reaction) using Pfu DNA polymerase (2.5
units/reaction) in a final reaction volume of 50 µl. The PCR conditions
were as follows: 95 °C for 45 seconds (1 cycle); 95 °C for 30 seconds,
55 °C for 1 minute, 68 °C for 15 minutes (16 cycles). After
amplification, the reaction mixture was treated with 1 µl of DpnI
(1 Unit) for 1 hour at 37 °C in order to digest the parental DNA.
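
For planning purposes, the programmed hold times of the two thermocycling programs described in this section can be totalled. This is a rough sketch that ignores the time the instrument spends ramping between temperatures; the function name is hypothetical.

```python
def program_seconds(initial_s, cycle_steps_s, cycles):
    """Sum of programmed hold times; ramp time between steps is ignored."""
    return initial_s + cycles * sum(cycle_steps_s)


# Whole-plasmid mutagenesis: 45 s, then 16 x (30 s, 1 min, 15 min).
mutagenesis_s = program_seconds(45, (30, 60, 15 * 60), 16)

# Fragment amplification with Vent polymerase: 45 s, then 25 x (30 s, 1 min, 2 min).
amplification_s = program_seconds(45, (30, 60, 2 * 60), 25)

print(mutagenesis_s / 60, "min;", amplification_s / 60, "min")
```

The long 68 °C extension step dominates the mutagenesis program, which is expected when the polymerase has to copy the entire plasmid in every cycle.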

One microliter of this digest was used to transform 50 µl of XL1-Blue
cells to repair nicks and propagate the mutated plasmid. Plasmid
DNA was purified and transformed into BL21(DE3) cells for expression
studies.

EXAMPLE 1

NS3 Catalytic Domain Constructs 

i. HIS-NS4A(21-32)-GSGS-NS3(3-181) (SEQ ID NO: 1)

HIS-NS4A(21-32)-GSGS-NS3(3-181) was constructed by joining amino
acids 21-32 of the NS4A peptide to the N-terminal domain of NS3
protease (NS3 amino acids 3-181) via the linker GSGS (SEQ ID NO: 21),
and was cloned into the pET-28b+ vector as described above. The 5'
primer reads as follows:

5' GATATACATATGGGTTCTGTTGTTATTGTTGGTAGAATTATTTTATCT
GGTAGTGGTAGTATCACGGCCTACTCCCAA 3' (SEQ ID NO: 26).

The 3' primer reads as follows:

5' CTCAGCGAATTCTCAAGACCGCATAGTAGTTTCCAT 3' (SEQ ID
NO: 27).
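
As a check on the primer design, the two primers above can be translated in silico. The sketch below assumes the standard genetic code and reads the 5' primer from its first ATG (the start codon inside the NdeI site); the helper names are hypothetical and no external library is used.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the first base; '*' marks a stop."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


FIVE_PRIME = ("GATATACATATGGGTTCTGTTGTTATTGTTGGTAGAATTATTTTATCT"
              "GGTAGTGGTAGTATCACGGCCTACTCCCAA")   # SEQ ID NO: 26
THREE_PRIME = "CTCAGCGAATTCTCAAGACCGCATAGTAGTTTCCAT"  # SEQ ID NO: 27

forward_peptide = translate(FIVE_PRIME[FIVE_PRIME.index("ATG"):])
reverse_peptide = translate(reverse_complement(THREE_PRIME))

print(forward_peptide)   # Met, NS4A peptide, linker, first NS3 residues
print(reverse_peptide)   # last NS3 residues, stop, then the EcoRI region
```

Read this way, the 5' primer encodes a methionine followed by GSVVIVGRIILS, the GSGS linker and ITAYSQ, and the reverse complement of the 3' primer ends the coding region with a stop codon immediately ahead of the EcoRI site (GAATTC).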

ii. HIS-NS4A(21-32)-GSGS-NS3(3-181)/I17K (SEQ ID NO: 2)

A single amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-181) was
constructed by creating a point mutation at position 17 of the NS3
domain of the HIS-NS4A(21-32)-GSGS-NS3(3-181) construct as described
above. Two oligonucleotide primers, each complementary to opposite
strands of the template, were generated which contain the point
mutation which alters amino acid number 17 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCAAGATCACTAGCCTTACAGGC 3'
(SEQ ID NO: 28).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTGATCTTGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 29).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-181), along with these two
primers, was utilized in a PCR reaction to generate the point mutation.
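
The primer pair for this mutation can be checked for strand complementarity, and the mutated codon can be located by comparing the I17K top strand with the I18K top strand (SEQ ID NO: 30) used for the next construct. This is an illustrative sketch; it assumes the primers are read in frame from their first base, and the helper names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def codons(seq):
    """Split a sequence into complete codons, starting at the first base."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


I17K_TOP = "CGGGGCCTACTTGGTTGCAAGATCACTAGCCTTACAGGC"     # SEQ ID NO: 28
I17K_BOTTOM = "GCCTGTAAGGCTAGTGATCTTGCAACCAAGTAGGCCCCG"  # SEQ ID NO: 29
I18K_TOP = "CGGGGCCTACTTGGTTGCATCAAGACTAGCCTTACAGGC"     # SEQ ID NO: 30

pair_is_complementary = reverse_complement(I17K_TOP) == I17K_BOTTOM

# Zero-based codon positions at which the two top strands differ.
differing_codons = [
    i for i, (a, b) in enumerate(zip(codons(I17K_TOP), codons(I18K_TOP)))
    if a != b
]

print(pair_is_complementary, differing_codons)
print(codons(I17K_TOP)[6:8], codons(I18K_TOP)[6:8])
```

The two top strands differ only in two adjacent codons, where ATC (isoleucine) and AAG (lysine) swap places, which matches a lysine substitution at position 17 in one primer and at position 18 in the other.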

iii. HIS-NS4A(21-32)-GSGS-NS3(3-181)/I18K (SEQ ID NO: 3)

A single amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-181) was
constructed by creating a point mutation at position 18 of the NS3
domain of the HIS-NS4A(21-32)-GSGS-NS3(3-181) construct as described
above. Two oligonucleotide primers, each complementary to opposite
strands of the template, were generated which contain the point
mutation which alters amino acid number 18 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCATCAAGACTAGCCTTACAGGC 3'
(SEQ ID NO: 30).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTCTTGATGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 31).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-181), along with these two
primers, was utilized in a PCR reaction to generate the point mutation.

iv. HIS-NS4A(21-32)-GSGS-NS3(3-181)/I17K, I18K (SEQ ID NO: 4)

A single amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-181)/I18K
was constructed by creating a point mutation at position 17 of the NS3
domain of the HIS-NS4A(21-32)-GSGS-NS3(3-181)/I18K construct as
described above. Two oligonucleotide primers, each complementary to
opposite strands of the template, were generated which contain the
point mutation which alters amino acid number 17 (isoleucine) to a
lysine. The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCAAGAAGACTAGCCTTACAGGC 3'
(SEQ ID NO: 32).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTCTTCTTGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 33).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-181)/I18K, along with these two
primers, was utilized in a PCR reaction to generate the point mutation.

v. HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A (SEQ ID NO: 5)

A single amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-181) was
constructed by creating a point mutation at position 139 of the NS3
domain of the HIS-NS4A(21-32)-GSGS-NS3(3-181) construct as described
above. Two oligonucleotide primers, each complementary to opposite
strands of the template, were generated which contain the point
mutation which alters amino acid number 139 (catalytic serine) to an
alanine. The top strand primer was as follows:

5' CTCCTACTTGAAGGGCTCTGCTGGTGGTCCACTGCTCTGC 3'
(SEQ ID NO: 34).

The bottom strand reads as follows:

5' GCAGAGCAGTGGACCACCAGCAGAGCCCTTCAAGTAGGAG 3'
(SEQ ID NO: 35).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-181), along with these two
primers, was utilized in a PCR reaction to generate the point mutation.
vi. HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A, I17K (SEQ ID NO: 6)

A single amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A
was constructed by creating a point mutation at position 17 of
the NS3 domain of the HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A construct as
described above. Two oligonucleotide primers, each complementary to
opposite strands of the template, were generated which contain the point
mutation which alters amino acid number 17 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCAAGATCACTAGCCTTACAGGC 3'
(SEQ ID NO: 36).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTGATCTTGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 37).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A, along with these two
primers, was utilized in a PCR reaction to generate this point mutation.

vii. HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A, I18K (SEQ ID NO: 7)

A single amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A
was constructed by creating a point mutation at position 18 of
the NS3 domain of the HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A construct as
described above. Two oligonucleotide primers, each complementary to
opposite strands of the template, were generated which contain the point
mutation which alters amino acid number 18 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCATCAAGACTAGCCTTACAGGC 3'
(SEQ ID NO: 38).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTCTTGATGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 39).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A, along with these two
primers, was utilized in a PCR reaction to generate this point mutation.

viii. HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A, I17K, I18K (SEQ ID NO: 8)

A single amino acid mutant of
HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A, I17K was constructed by creating
a point mutation at position 18 of the NS3 domain of the
HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A, I17K construct as described
above. Two oligonucleotide primers, each complementary to opposite
strands of the template, were generated which contain the point
mutation which alters amino acid number 18 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCAAGAAGACTAGCCTTACAGGC 3'
(SEQ ID NO: 40).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTCTTGATGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 41).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A, I17K, along with
these two primers, was utilized in a PCR reaction to generate this point
mutation.

ix. HIS-NS4A(21-32)-PAGG-NS3(3-181) (SEQ ID NO: 9)

An NS4A-tethered form of the NS3 catalytic domain,
HIS-NS4A(21-32)-PAGG-NS3(3-181), was constructed by joining the NS4A
peptide GSVVIVGRIILS (NS4A amino acids 21-32) to the N-terminal domain of
NS3 protease (NS3 amino acids 3-181) via the linker PAGG (SEQ ID NO:
22), and was cloned into the pET-28b+ vector as described above. Primers
were designed to generate a 616 base pair PCR fragment containing an
NdeI site followed by the NS4A peptide, the PAGG linker, and amino
acids 3-181 of the NS3 catalytic domain at the 5' terminus and a stop
codon flanked by an EcoRI site at the 3' terminus. The 5' primer reads as
follows:

5' GATATACATATGGGTTCTGTTGTTATTGTTGGTAGAATTATTTT
ATCTCCTGCTGGTGGTATCACGGCCTACTCCCAA 3' (SEQ ID NO: 42).

The 3' primer reads as follows:

5' CTCAGCGAATTCTCAAGACCGCATAGTAGTTTCCAT 3' (SEQ ID NO: 43).

Plasmid DNA (pMD-34-2), comprised of the pET-22b+ vector encompassing
the gene insert encoding HIS-NS3(1-631) from the 1b/BK strain, was used
as the template for PCR.

x. HIS-NS4A(21-32)-PAGG-NS3(3-181)/I17K (SEQ ID NO: 10)

A single amino acid mutant of HIS-NS4A(21-32)-PAGG-NS3(3-181) was
constructed by creating a point mutation at position 17 of the NS3
domain of the HIS-NS4A(21-32)-PAGG-NS3(3-181) construct as described
above. Two oligonucleotide primers, each complementary to opposite
strands of the template, were generated which contain the point
mutation which alters amino acid number 17 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCAAGATCACTAGCCTTACAGGC 3'
(SEQ ID NO: 44).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTGATCTTGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 45).

The template, HIS-NS4A(21-32)-PAGG-NS3(3-181), along with these two
primers, was utilized in a PCR reaction to generate this point mutation.

xi. HIS-NS4A(21-32)-PAG-NS3(3-181) (SEQ ID NO: 46)

An NS4A-tethered form of the NS3 catalytic domain,
HIS-NS4A(21-32)-PAG-NS3(3-181), was constructed by joining the NS4A
peptide GSVVIVGRIILS (NS4A amino acids 21-32) to the N-terminal domain of
NS3 protease (NS3 amino acids 3-181) via the linker PAG (SEQ ID NO:
47), and was cloned into the pET-28b+ vector as described above. Primers
were designed to generate a 613 base pair PCR fragment containing an
NdeI site followed by the NS4A peptide, the PAG linker, and amino
acids 3-181 of the NS3 catalytic domain at the 5' terminus and a stop
codon flanked by an EcoRI site at the 3' terminus. The 5' primer reads as
follows:

5' GATATACATATGGGTTCTGTTGTTATTGTTGGTAGAATTATTTT
ATCTCCTGCTGGTATCACGGCCTACTCCCAA 3' (SEQ ID NO: 48).

The 3' primer reads as follows:

5' CTGAGCGAATTCTCAAGACCGCATAGTAGTTTCCAT 3'
(SEQ ID NO: 49).

Plasmid DNA (pMD-34-2), comprised of the pET-22b+ vector encompassing
the gene insert encoding HIS-NS3(1-631) from the 1b/BK strain, was used
as the template for PCR.

xii. HIS-NS4A(21-32)-PAG-NS3(3-181)/I17K (SEQ ID NO: 50)

A single amino acid mutant of HIS-NS4A(21-32)-PAG-NS3(3-181)
was constructed by creating a point mutation at position 17 of the NS3
domain of the HIS-NS4A(21-32)-PAG-NS3(3-181) construct as described
above. Two oligonucleotide primers, each complementary to opposite
strands of the template, were generated which contain the point
mutation which alters amino acid residue number 17 (isoleucine) to a
lysine. The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCAAGATCACTAGCCTTACAGGC 3'
(SEQ ID NO: 51).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTGATCTTGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 52).

The template, HIS-NS4A(21-32)-PAG-NS3(3-181), along with these two
primers, was utilized in a PCR reaction to generate this point mutation.

xiii. HIS-NS4A(21-32)-GGS-NS3(3-181) (SEQ ID NO: 53)

An NS4A-tethered form of the NS3 catalytic domain,
HIS-NS4A(21-32)-GGS-NS3(3-181), was constructed by joining the NS4A
peptide GSVVIVGRIILS (NS4A amino acids 21-32) to the N-terminal domain of
NS3 protease (NS3 amino acids 3-181) via the linker GGS (SEQ ID NO:
54), and was cloned into the pET-28b+ vector as described above. Primers
were designed to generate a 613 base pair PCR fragment containing an
NdeI site followed by the NS4A peptide, the GGS linker, and amino
acids 3-181 of the NS3 catalytic domain at the 5' terminus and a stop
codon flanked by an EcoRI site at the 3' terminus. The 5' primer reads as
follows:

5' GATATACATATGGGTTCTGTTGTTATTGTTGGTAGAATTATTTT
ATCTGGTGGTTCTATCACGGCCTACTCCCAA 3' (SEQ ID NO: 55).

The 3' primer reads as follows:

5' CTCAGCGAATTCTCAAGACCGCATAGTAGTTTCCAT 3'
(SEQ ID NO: 56).

Plasmid DNA (pMD-34-2), comprised of the pET-22b+ vector encompassing
the gene insert encoding HIS-NS3(1-631) from the 1b/BK strain, was used
as the template for PCR.

xiv. HIS-NS4A(21-32)-GGS-NS3(3-181)/I17K (SEQ ID NO: 57)

A single amino acid mutant of HIS-NS4A(21-32)-GGS-NS3(3-181) was
constructed by creating a point mutation at position 17 of the NS3
domain of the HIS-NS4A(21-32)-GGS-NS3(3-181) construct as described
above. Two oligonucleotide primers, each complementary to opposite
strands of the template, were generated which contain the point
mutation which alters amino acid number 17 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCAAGATCACTAGCCTTACAGGC 3'
(SEQ ID NO: 58).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTGATCTTGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 59).

The template, HIS-NS4A(21-32)-GGS-NS3(3-181), along with these two
primers, was utilized in a PCR reaction to generate this point mutation.

EXAMPLE 2 
NS3 Full-length Constructs

i. HIS-NS3(1-631)/I17K (SEQ ID NO: 60)

A single amino acid mutant of HIS-NS3(1-631) was formed by
creating a point mutation at position 17 of NS3 protease using the
plasmid DNA (pMD-34-2), comprised of the pET-22b+ vector encompassing
the gene insert encoding HIS-NS3(1-631) from the 1b/BK strain, as
described above. Two oligonucleotide internal primers, each
complementary to opposite strands of the template, were generated
which contain the point mutation which alters amino acid number 17
(isoleucine) to a lysine. The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCAAGATCACTAGCCTTACAGGC 3'
(SEQ ID NO: 61).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTGATCTTGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 62).

The template, plasmid DNA (pMD-34-2), comprised of the pET-22b+ vector
encompassing the gene insert encoding HIS-NS3(1-631) from the 1b/BK
strain, along with these two primers, was utilized in a PCR reaction to
generate this point mutation.

ii. HIS-NS3(1-631)/I18K (SEQ ID NO: 63)

A single amino acid mutant of HIS-NS3(1-631) was formed by
creating a point mutation at position 18 of NS3 protease using the
plasmid-DNA (PMD-34-2), comprised of pET-22b+ vector encompassing
the gene insert encoding HIS-NS3(1-631) from 1b/BK strain, as described
above. Two oligonucleotide internal primers, each complementary to
opposite strands of the template, were generated which contain the point
mutation which alters amino acid number 18 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCATCAAGACTAGCCTTACAGGC 3'
(SEQ ID NO: 64).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTCTTGATGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 65).

The template, plasmid-DNA (PMD-34-2), comprised of pET-22b+ vector
encompassing the gene insert encoding HIS-NS3(1-631) from 1b/BK strain,
along with these two primers, was utilized in a PCR reaction to generate
this point mutation.

iii. HIS-NS3(1-631)/S139A (SEQ ID NO: 66)

A single amino acid mutant of HIS-NS3(1-631) was formed by
creating a point mutation at position 139 of the NS3 protease using the
plasmid-DNA (PMD-34-2), comprised of pET-22b+ vector encompassing
the gene insert encoding HIS-NS3(1-631) from 1b/BK strain, as described
above. Two oligonucleotide internal primers, each complementary to
opposite strands of the template, were generated which contain the point
mutation which alters amino acid number 139 (catalytic serine) to an
alanine. The top strand primer was as follows:

5' CTCCTACTTGAAGGGCTCTGCTGGTGGTCCACTGCTCTGC 3'
(SEQ ID NO: 67).

The bottom strand reads as follows:

5' GCAGAGCAGTGGACCACCAGCAGAGCCCTTCAAGTAGGAG 3'
(SEQ ID NO: 68).

The template, plasmid-DNA (PMD-34-2), comprised of pET-22b+ vector
encompassing the gene insert encoding HIS-NS3(1-631) from 1b/BK strain,
along with these two primers, was utilized in a PCR reaction to generate
this point mutation.

iv. HIS-NS3(1-631)/I403S (SEQ ID NO: 69)

A single amino acid mutant of HIS-NS3(1-631) was formed by
creating a point mutation at position 403 of the NS3 protease using the
plasmid-DNA (PMD-34-2), comprised of pET-22b+ vector encompassing
the gene insert encoding HIS-NS3(1-631) from 1b/BK strain, as described
above. Two oligonucleotide internal primers, each complementary to
opposite strands of the template, were generated which contain the point
mutation which alters amino acid number 403 (isoleucine) to a serine.
The top strand primer was as follows:

5' GTCCGTCATACCAACTTCCGGAGACGTCGTTGTCG 3'
(SEQ ID NO: 70).

The bottom strand reads as follows:

5' CGACAACGACGTCTCCGGAAGTTGGTATGACGGAC 3'
(SEQ ID NO: 71).

The template, plasmid-DNA (PMD-34-2), comprised of pET-22b+ vector
encompassing the gene insert encoding HIS-NS3(1-631) from 1b/BK strain,
along with these two primers, was utilized in a PCR reaction to generate
this point mutation.

v. HIS-NS3(1-631)/NdeI (SEQ ID NO: 72)

A silent mutant of HIS-NS3(1-631) was formed to eliminate the
internal NdeI restriction site within NS3 protease using the
plasmid-DNA (PMD-34-2), comprised of pET-22b+ vector encompassing the
gene insert encoding HIS-NS3(1-631) from 1b/BK strain, as described
above. Two oligonucleotide internal primers, each complementary to
opposite strands of the template, were generated which contain point
mutations which alter the codons on the reading strand of alanine 217
from GCA to GCC and tyrosine 218 from TAT to TAC. The top strand primer
was as follows:

5' ACTAAAGTGCCGGCTGCCTACGCAGCCCAAGGG 3'
(SEQ ID NO: 73).

The bottom strand reads as follows:

5' CCCTTGGGCTGCGTAGGCAGCCGGCACTTTAGT 3'
(SEQ ID NO: 74).

The template, plasmid-DNA (PMD-34-2), comprised of pET-22b+ vector
encompassing the gene insert encoding HIS-NS3(1-631) from 1b/BK strain,
along with these two primers, was utilized in a PCR reaction to generate
these point mutations.
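The effect of the two codon changes can be illustrated with a short
sketch. It is not part of the original procedure. It assumes that the
top strand primer (SEQ ID NO: 73) is read in frame from its first
nucleotide, and it recovers the corresponding wild-type stretch simply by
reverting the two changes stated above (GCC to GCA and TAC to TAT); the
names used are illustrative.

```python
# Illustrative sketch: confirm that the GCA->GCC and TAT->TAC changes remove
# the NdeI recognition site (CATATG) without changing the encoded residues.
# ASSUMPTION: the primer is read in frame from its first nucleotide.

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate DNA in frame 0, ignoring any trailing partial codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


NDEI_SITE = "CATATG"
MUTANT_PRIMER = "ACTAAAGTGCCGGCTGCCTACGCAGCCCAAGGG"  # SEQ ID NO: 73

# Revert the two silent changes to obtain the wild-type stretch.
WILD_TYPE = MUTANT_PRIMER.replace("GCCTAC", "GCATAT", 1)

print("site in wild type:", NDEI_SITE in WILD_TYPE)
print("site in mutant:   ", NDEI_SITE in MUTANT_PRIMER)
print("same protein:     ", translate(WILD_TYPE) == translate(MUTANT_PRIMER))
```

Under the stated frame assumption, both stretches translate to the same
peptide, while only the wild-type stretch contains CATATG.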

vi. HIS-NS4A(21-32)-GSGS-NS3(3-631) (SEQ ID NO: 4)

An NS4A-tethered form of the NS3 full-length domain,
HIS-NS4A(21-32)-GSGS-NS3(3-631), was constructed via a cut and paste
strategy as described above. Briefly, a 270 bp fragment was generated by
restricting HIS-NS4A(21-32)-GSGS-NS3(3-181) with XbaI/BspMI; this
fragment encompassed sequences encoding a histidine tag followed by a
thrombin site, the NS4A peptide GSVVIVGRIILS (NS4A amino acids 21-32),
the linker GSGS (SEQ ID NO: 21) and NS3 amino acids 3-48. A second
fragment (7111 bp) was generated by restricting plasmid-DNA (PMD-34-2),
comprised of pET-22b+ vector encompassing the gene insert
encoding HIS-NS3(1-631) from 1b/BK strain, with XbaI/BspMI, resulting
in a fragment encompassing the pET-22b+ vector backbone in addition to
amino acids 49-631. These two fragments were then ligated together
with T4 DNA ligase to form HIS-NS4A(21-32)-GSGS-NS3(3-631).

vii. HIS-NS4A(21-32)-GSGS-NS3(3-631)/I17K (SEQ ID NO: 12)

A single amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-631) was
constructed by creating a point mutation at position 17 of the NS3
domain of the HIS-NS4A(21-32)-GSGS-NS3(3-631) construct as described
above. Two oligonucleotide primers, each complementary to opposite
strands of the template, were generated which contain the point
mutation which alters amino acid number 17 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCAAGATCACTAGCCTTACAGGC 3'
(SEQ ID NO: 75).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTGATCTTGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 76).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-631), along with these two
primers, was utilized in a PCR reaction to generate this point mutation.

viii. HIS-NS4A(21-32)-GSGS-NS3(3-631)/I18K (SEQ ID NO: 13)

A single amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-631) was
constructed by creating a point mutation at position 18 of the NS3
domain of the HIS-NS4A(21-32)-GSGS-NS3(3-631) construct as described
above. Two oligonucleotide primers, each complementary to opposite
strands of the template, were generated which contain the point
mutation which alters amino acid number 18 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCATCAAGACTAGCCTTACAGGC 3'
(SEQ ID NO: 77).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTCTTGATGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 78).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-631), along with these two
primers, was utilized in a PCR reaction to generate this point mutation.

ix. HIS-NS4A(21-32)-GSGS-NS3(3-631)/I17K/I18K (SEQ ID NO: 14)

A double amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-631) was
constructed by creating two point mutations, at positions 17 and 18 of
the NS3 domain of the HIS-NS4A(21-32)-GSGS-NS3(3-631) construct,
simultaneously as described above. Two oligonucleotide primers, each
complementary to opposite strands of the template, were generated
which contain the point mutations which alter amino acid numbers 17
(isoleucine) and 18 (isoleucine) to lysines. The top strand primer was as
follows:

5' CGGGGCCTACTTGGTTGCAAGAAGACTAGCCTTACAGGC 3'
(SEQ ID NO: 79).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTCTTCTTGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 80).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-631), along with these two
primers, was utilized in a PCR reaction to generate these point mutations.
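How the I17K, I18K and I17K/I18K primers differ can be seen by
translating them. The sketch below is illustrative only and is not part
of the original procedure. It assumes that each top strand primer is read
in frame from its first nucleotide (an inference, not something stated in
the text); the function and variable names are likewise assumptions.

```python
# Illustrative sketch: translate the top-strand mutagenic primers to see
# which codons change. ASSUMPTION: the reading frame starts at the first
# nucleotide of each primer (inferred, not stated in the text).

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate DNA in frame 0, ignoring any trailing partial codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


PRIMERS = {
    "I17K": "CGGGGCCTACTTGGTTGCAAGATCACTAGCCTTACAGGC",
    "I18K": "CGGGGCCTACTTGGTTGCATCAAGACTAGCCTTACAGGC",
    "I17K/I18K": "CGGGGCCTACTTGGTTGCAAGAAGACTAGCCTTACAGGC",
}

for name, primer in PRIMERS.items():
    print(name, translate(primer))
```

Under the assumed frame, the three primers differ only in their seventh
and eighth codons, where ATC (isoleucine) is replaced by AAG (lysine) at
one or both positions.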

x. HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A (SEQ ID NO: 15)

An NS4A-tethered form of the NS3 full-length domain,
HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A, was constructed via a cut and
paste strategy as described above. Briefly, a 290 bp fragment was
generated by restricting HIS-NS4A(21-32)-GSGS-NS3(3-181) with
XbaI/BspMI; this fragment encompassed sequences encoding a histidine
tag, a thrombin site, amino acids 21-32 of the NS4A peptide, the linker
GSGS (SEQ ID NO: 21) and NS3 amino acids 3-48. A second fragment
(7111 bp) was generated by restricting the HIS-NS3(1-631)/S139A
construct with XbaI/BspMI, resulting in a fragment encompassing the
pET-22b+ vector backbone in addition to amino acids 49-631. These two
fragments were then ligated together with T4 DNA ligase to form
HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A.

xi. HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A, I17K (SEQ ID NO: 16)

A single amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A
was constructed by creating a point mutation at position 17 of the NS3
domain of the HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A construct as
described above. Two oligonucleotide primers, each complementary to
opposite strands of the template, were generated which contain the point
mutation which alters amino acid number 17 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCAAGATCACTAGCCTTACAGGC 3'
(SEQ ID NO: 81).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTGATCTTGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 82).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A, along with these
two primers, was utilized in a PCR reaction to generate this point
mutation.

xii. HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A, I18K (SEQ ID NO: 17)

A single amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A
was constructed by creating a point mutation at position 18 of the NS3
domain of the HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A construct as
described above. Two oligonucleotide primers, each complementary to
opposite strands of the template, were generated which contain the point
mutation which alters amino acid number 18 (isoleucine) to a lysine.
The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCATCAAGACTAGCCTTACAGGC 3'
(SEQ ID NO: 83).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTCTTGATGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 84).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A, along with these
two primers, was utilized in a PCR reaction to generate this point
mutation.

xiii. HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A, I17K, I18K (SEQ ID NO: 18)

A single amino acid mutant of HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A,
I17K was constructed by creating a point mutation at position 18 of the
NS3 domain of the HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A, I17K construct
as described above. Two oligonucleotide primers, each complementary
to opposite strands of the template, were generated which contain the
point mutation which alters amino acid number 18 (isoleucine) to a
lysine. The top strand primer was as follows:

5' CGGGGCCTACTTGGTTGCAAGAAGACTAGCCTTACAGGC 3'
(SEQ ID NO: 85).

The bottom strand reads as follows:

5' GCCTGTAAGGCTAGTCTTCTTGCAACCAAGTAGGCCCCG 3'
(SEQ ID NO: 86).

The template, HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A, I17K, along with
these two primers, was utilized in a PCR reaction to generate this point
mutation.

xiv. HIS-NS4A(15-32)-GSGS-NS3(3-631) (SEQ ID NO: 19)

An NS4A-tethered form of the NS3 full-length domain,
HIS-NS4A(15-32)-GSGS-NS3(3-631), was constructed by joining amino acids
15-32 of the NS4A peptide to the N-terminal end of the NS3 protease (NS3
amino acids 3-631) via the linker GSGS, and was cloned into the pET-28b+
vector as described above with the following modification. Primers were
designed to generate a PCR fragment containing an NdeI site followed by
the NS4A peptide, the GSGS linker (SEQ ID NO: 21), and amino acids 3-631
of the NS3 catalytic domain at the 5' terminus and a stop codon flanked
by an EcoRI site at the 3' terminus. The 5' primer sequence was as
follows:

5' GATATACATATGGCTTACTCTCTGACTACGGGTTCTGTTGTTATT
GTTGGTAGAATTATTTTATCTGGTAGTGGTAGTATCACGGCCTACTCCCAA 3'
(SEQ ID NO: 87).

The 3' primer sequence was as follows:

5' GTGGTGGTGCTCGAGGCTGCCGCGCGGCA
CCAGCGTAACGACCTCCAGGTC 3' (SEQ ID NO: 88).

The template used was HIS-NS4A(21-32)-GSGS-NS3(3-631). The resulting
PCR fragment was 1974 bases. Vent DNA polymerase was employed
and a final concentration of 200 µM dNTPs was used. The PCR
conditions were as follows: 95 °C for 45 seconds (1 cycle); 95 °C for 30
seconds, 55 °C for 1 minute, 72 °C for 2 minutes (25 cycles). The product
was purified with QIAquick PCR kit (Qiagen). This PCR product, along
with the 6.6 kb vector backbone (HIS-NS4A(21-32)-GSGS-NS3(3-631)), was
double digested with NdeI and BamHI. The digested fragments of 1.43
and 6.6 kbp, respectively, were run on an agarose gel, excised, and
column purified with QIAquick gel extraction kit (Qiagen). They were
quantitated and then ligated together with T4 DNA ligase.
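As a rough planning aid, the programmed hold times of the PCR described
above can be summed. This is an illustrative sketch only; it counts the
stated hold times and ignores thermocycler ramp times, and the function
name is an assumption.

```python
# Illustrative sketch: minimum run time of the PCR program described above,
# counting only the programmed hold times (ramp times are ignored).

INITIAL_DENATURATION_S = 45     # 95 °C, 1 cycle
CYCLE_STEPS_S = (30, 60, 120)   # 95 °C, 55 °C, 72 °C
N_CYCLES = 25
FRAGMENT_BASES = 1974


def program_hold_time_s(initial_s: int, steps_s: tuple, cycles: int) -> int:
    """Total programmed hold time in seconds."""
    return initial_s + cycles * sum(steps_s)


total_s = program_hold_time_s(INITIAL_DENATURATION_S, CYCLE_STEPS_S, N_CYCLES)
# Average synthesis rate needed to copy the fragment within the 72 °C step.
required_rate = FRAGMENT_BASES / CYCLE_STEPS_S[2]

print(f"{total_s} s = {total_s / 60:.2f} min")
print(f"{required_rate:.2f} bases per second")
```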

xv. HIS-NS4A(15-32)-GSGS-NS3(3-631)/S139A (SEQ ID NO: 20)

An NS4A-tethered form of the NS3 full-length domain,
HIS-NS4A(15-32)-GSGS-NS3(3-631)/S139A, was constructed by joining amino
acids 15-32 of the NS4A peptide to the N-terminal end of the NS3
protease (NS3 amino acids 3-631) via the linker GSGS (SEQ ID NO: 21),
and was cloned into the pET-28b+ vector as described above with the
following modification. Primers were designed to generate a PCR fragment
containing an NdeI site followed by the NS4A peptide, the GSGS linker
(SEQ ID NO: 21), and amino acids 3-631 of the NS3 catalytic domain at
the 5' terminus and a stop codon flanked by an EcoRI site at the 3'
terminus. The 5' primer sequence was as follows:

5' GATATACATATGGCTTACTCTCTGACTACGGGTTCTGTTGTTATT
GTTGGTAGAATTATTTTATCTGGTAGTGGTAGTATCACGGCCTACTCCCAA 3'
(SEQ ID NO: 89).

The 3' primer reads as follows:

5' TGGTGGTGCTCGAGGCTGCCGCGCGGCACCAGCGTAACGACCT
CCAGGTC 3' (SEQ ID NO: 90).

The template used was HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A. The
resulting PCR fragment was 1974 bases. Vent DNA polymerase was
employed and a final concentration of 200 µM dNTPs was used. The
PCR conditions were as follows: 95 °C for 45 seconds (1 cycle); 95 °C for
30 seconds, 55 °C for 1 minute, 72 °C for 2 minutes (25 cycles). The
product was purified with QIAquick PCR kit (Qiagen). This PCR product,
along with the 6.6 kb vector backbone (HIS-NS4A(21-32)-GSGS-NS3(3-631)),
was double digested with NdeI and BamHI. The digested fragments of
1.43 and 6.6 kbp, respectively, were run on an agarose gel, excised, and
column purified with QIAquick gel extraction kit (Qiagen). They were
quantitated and then ligated together with T4 DNA ligase.



EXAMPLE 3

Expression and Purification of HCV NS4A-NS3 Complexes

A. Small Scale Expression Studies

All constructed plasmids were transformed into DH5α cells for
production of large amounts of plasmid-DNA. The purified plasmid-DNA
was transformed into BL21(DE3) cells for expression studies. The
cells were grown in Terrific Broth in baffled flasks at 37°C to an OD of
1.0 and the temperature was lowered to 23°C. The cultures were induced
with 0.4 mM IPTG and were harvested 3 hours after induction. Cells
were sonicated for 1 min in 50 mM HEPES, pH 7.5, 20% glycerol, 0.1%
βOG, 0.3 M NaCl, 10 mM βME and spun at 13,000 rpm for 10 min. The
supernatants were analyzed on 10% Novex SDS-PAGE.

B. Large-Scale Expression And Purification Of NS4A-Tethered
Forms Of HCV NS3(3-181) Protease

E. coli BL21(DE3) cells harboring either plasmid pET-22b or pET-28b
encoding various native, single, or multiple mutants of NS4A-tethered
forms of NS3(3-181) were grown at 37°C in Terrific Broth
supplemented with either 100 µg/ml of ampicillin (for pET-22b) or 25
µg/ml kanamycin (for pET-28b) in a 10-liter fermentor. When the cell
density reached an OD of 2-3, the temperature was lowered to 23°C
within 5 minutes and cells were induced with 0.4 mM IPTG. Cells were
harvested 3 hours after induction and frozen at -20 °C prior to
purification.

Cell pellets were resuspended in 600 ml of lysis buffer containing
50 mM HEPES, pH 7.4, 10% glycerol, 0.3 M NaCl, 0.1% βOG, 2 mM βME
(buffer A), homogenized using a cell homogenizer (Omni Mixer ES) for
2 min and the cells were disrupted by two passes through a
Microfluidizer (Microfluidics Model #M-110F) at 10,000 p.s.i. The lysate
was centrifuged at 85,000 x g for 45 min. The supernatant was filtered
through 0.8 micron filter units (Nalgene) and applied at 40 ml/min to an
11-ml Ni-imidodiacetate (POROS 20 MC resin) column in the presence
of 20 mM imidazole on BIOCAD (Perseptive Biosystems). The
column was washed with 10 column volumes of buffer A, followed by
15 column volumes of buffer A containing 1.0 M NaCl and 20 mM
imidazole (buffer B). The bound protease was eluted with the elution
buffer (buffer B containing 250 mM imidazole). The eluted fractions
containing the protease were pooled and dialyzed versus 16 liters of 50
mM HEPES, pH 7.4, 10% glycerol, 1 M NaCl, 10 mM βME in order to
remove the imidazole and the detergent.

When the removal of the N-terminal histidine tag was required,
human thrombin (Enzyme Research) was added to the eluted, pooled
fractions at a thrombin:protease ratio of 8 units per mg of protease and
thrombin cleavage was allowed to proceed during the dialysis step for 18
hours. The dialyzed, thrombin-cleaved protease was applied to three
Sephacryl-100 sizing columns (26 x 60 cm, Pharmacia) in series,
equilibrated in 50 mM HEPES, pH 7.4, 10% glycerol, 1 M NaCl, 10 mM
βME, at 0.5 ml/min. Fractions containing purified protease at
>95% homogeneity as judged by SDS-PAGE were pooled and flash-frozen
at -80 °C.
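The wash volumes and thrombin amounts implied by this protocol follow
from simple arithmetic, sketched below for illustration. The sketch
assumes that the washes are run at the same 40 ml/min stated for sample
application (the text gives no separate wash flow rate), and the 25 mg
protease yield is a hypothetical value chosen only to show the
calculation.

```python
# Illustrative sketch of the volume and enzyme arithmetic in the purification.
# ASSUMPTION: washes run at the same 40 ml/min used for loading; the text
# gives the flow rate only for sample application.

COLUMN_VOLUME_ML = 11.0
FLOW_RATE_ML_PER_MIN = 40.0
THROMBIN_UNITS_PER_MG = 8.0


def wash_volume_ml(column_volumes: float) -> float:
    """Volume of buffer corresponding to a number of column volumes."""
    return column_volumes * COLUMN_VOLUME_ML


def thrombin_units(protease_mg: float) -> float:
    """Thrombin units needed at 8 units per mg of protease."""
    return protease_mg * THROMBIN_UNITS_PER_MG


wash_a_ml = wash_volume_ml(10)  # buffer A wash
wash_b_ml = wash_volume_ml(15)  # buffer B wash
wash_minutes = (wash_a_ml + wash_b_ml) / FLOW_RATE_ML_PER_MIN

print(wash_a_ml, wash_b_ml, wash_minutes)
print(thrombin_units(25.0))  # 25 mg is a hypothetical yield
```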

C. Large-Scale Expression And Purification Of NS4A-Tethered
Forms Of HCV NS3(3-631) Protease

E. coli BL21(DE3) cells harboring either plasmid pET-22b or pET-28b
encoding various native, single, or multiple mutants of NS4A-tethered
forms of NS3(3-631) were grown at 37°C in Terrific Broth
supplemented with either 100 µg/ml of ampicillin (for pET-22b) or 25
µg/ml kanamycin (for pET-28b) in a 10-liter fermentor. When the cell
density reached an OD of 2-3, the temperature was lowered to 23°C
within 5 minutes and cells were induced with 0.4 mM IPTG. Cells were
harvested 3 hours after induction and frozen at -20 °C prior to
purification.

Cell pellets were resuspended in 600 ml of lysis buffer containing
50 mM HEPES, pH 7.4, 10% glycerol, 0.3 M NaCl, 0.1% βOG, 2 mM βME
(buffer A), homogenized using a cell homogenizer (Omni Mixer ES) for
2 min and the cells were disrupted by two passes through a
Microfluidizer (Microfluidics Model #M-110F) at 10,000 p.s.i. The lysate
was centrifuged at 85,000 x g for 45 min. The supernatant was filtered
through 0.8 micron filter units (Nalgene) and applied at 40 ml/min to an
11-ml Ni-imidodiacetate (POROS 20 MC resin) column in the presence
of 20 mM imidazole on BIOCAD (Perseptive Biosystems). The
column was washed with 10 column volumes of buffer A, followed by
15 column volumes of buffer A containing 1.0 M NaCl and 20 mM
imidazole (buffer B). The bound protease was eluted with the elution
buffer (buffer B containing 250 mM imidazole). The eluted fractions
containing the protease were pooled and dialyzed versus 16 liters of 50
mM HEPES, pH 7.4, 10% glycerol, 1 M NaCl, 10 mM βME in order to
remove the imidazole and the detergent.

When the removal of the N-terminal histidine tag was required,
human thrombin (Enzyme Research) was added to the eluted, pooled
fractions at a thrombin:protease ratio of 8 units per mg of protease and
thrombin cleavage was allowed to proceed during the dialysis step for 18
hours. The dialyzed, thrombin-cleaved protease was applied to three
Sephacryl-100 sizing columns (26 x 60 cm, Pharmacia) in series,
equilibrated in 50 mM HEPES, pH 7.4, 10% glycerol, 1 M NaCl, 10 mM
βME, at 0.5 ml/min. Fractions containing purified protease at
>95% homogeneity as judged by SDS-PAGE were pooled and flash-frozen
at -80 °C.

EXAMPLE 4

Molecular Weight Determination Of Various NS3 Protease Forms
By Size Exclusion Chromatography

Two hundred µl of various purified proteins were applied to a
calibrated Superdex-75 HR (1 cm x 30 cm) FPLC column equilibrated with
25 mM HEPES, pH 7.4, 1 M NaCl, 10% glycerol and 10 mM βME at 0.5
ml/min. The column was precalibrated using Pharmacia standard
calibration proteins (BSA: 67 kDa; Ovalbumin: 43 kDa;
Chymotrypsinogen A: 31 kDa; Ribonuclease A: 13.7 kDa). Protein
elution was monitored at 280 nm.
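A calibration of this kind is commonly evaluated by fitting the logarithm
of molecular weight against elution volume. The following is a minimal
sketch of that calculation, not the method actually used here. The
molecular weights are those of the standards listed above, but the
elution volumes are hypothetical placeholders, since the text does not
report them; all function names are assumptions.

```python
import math

# Illustrative sketch of a size-exclusion calibration: fit log10(MW) against
# elution volume for the standards, then read off an apparent MW.
# ASSUMPTION: the elution volumes below are hypothetical placeholders; the
# text lists the standards but not their elution volumes.

STANDARDS_KDA = {"BSA": 67.0, "Ovalbumin": 43.0,
                 "Chymotrypsinogen A": 31.0, "Ribonuclease A": 13.7}
HYPOTHETICAL_ELUTION_ML = {"BSA": 9.5, "Ovalbumin": 10.6,
                           "Chymotrypsinogen A": 11.4, "Ribonuclease A": 13.4}


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apparent_mw_kda(elution_ml, slope, intercept):
    """Apparent molecular weight (kDa) at a given elution volume."""
    return 10 ** (slope * elution_ml + intercept)


names = list(STANDARDS_KDA)
slope, intercept = fit_line(
    [HYPOTHETICAL_ELUTION_ML[n] for n in names],
    [math.log10(STANDARDS_KDA[n]) for n in names],
)
# Apparent MW of a protein eluting at a hypothetical 12.0 ml.
print(round(apparent_mw_kda(12.0, slope, intercept), 1))
```

A protein that elutes later than expected for its sequence mass gives an
apparent molecular weight smaller than predicted; an aggregated protein
elutes earlier and gives a larger one.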



The following covalent NS4A-NS3 complexes described above
were characterized by the above method:

HIS-NS4A(21-32)-GSGS-NS3(3-181)
HIS-NS4A(21-32)-GSGS-NS3(3-181)/I17K
HIS-NS4A(21-32)-GSGS-NS3(3-181)/I18K
HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A
HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A, I17K
HIS-NS4A(21-32)-GSGS-NS3(3-181)/S139A, I18K

HIS-NS4A(21-32)-PAGG-NS3(3-181)
HIS-NS4A(21-32)-PAGG-NS3(3-181)/I17K

HIS-NS4A(21-32)-PAG-NS3(3-181)/I17K

HIS-NS4A(21-32)-GSGS-NS3(3-631)
HIS-NS4A(21-32)-GSGS-NS3(3-631)/I17K
HIS-NS4A(21-32)-GSGS-NS3(3-631)/I18K
HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A
HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A, I17K
HIS-NS4A(21-32)-GSGS-NS3(3-631)/S139A, I18K

Of those constructs characterized, all covalent NS4A-NS3
complexes containing a three amino acid linker resulted in aggregated
forms, as judged by size exclusion chromatography. NS4A-tethered
forms in which a point mutation at position 17 or 18 had not been
introduced also resulted in aggregated forms, although they exhibited
activity identical to that of the monodispersed forms of the protease.

Covalent NS4A-NS3 complexes which contained a four amino
acid linker and a point mutation at position 17 and/or 18 resulted in
active, monodispersed proteins with apparent molecular weights
smaller than predicted as determined by size exclusion chromatography.

EXAMPLE 5

Determination of Proteolytic Activity

Following expression and purification, newly engineered
recombinant species were assayed for proteolytic activity utilizing a
1D-HPLC (reverse-phase chromatography) technique. Assays were
conducted using the 5A/5B (P8P8') substrate DTEDVVCC*SMSYTWTG-K
(SEQ ID NO: 25) in 25 mM Tris-HCl pH 7.5, 150 mM NaCl, 0.5 mM
EDTA, 10 mM DTT, 10% glycerol, and 0.05% lauryl maltoside.
(Concentrations of all proteins were determined by the BIORAD dye
method.) The catalytic domain His-NS3(1-181) (batch # 51072-92E) was
preincubated at a concentration of 250 nM in the presence of 20 µM 4A
peptide (KKGSVVIVGRIVLSGKPAIIPKK) for 15 minutes at 4°C. This mixture
was then diluted into the reaction volume at a final concentration of 8
µM 4A peptide and 100 nM catalytic domain. Reactions were incubated
at room temperature for 60 minutes and were quenched with an equal
volume of 10% phosphoric acid. Following injection, cleavage products
were monitored under a linear 0-80% acetonitrile gradient in 0.1% TFA.
The product P1'P8'K peak areas were automatically converted to product
quantity in nanomoles by a standard curve.

The various covalent NS4A-NS3 complexes whose proteolytic efficiency
has been determined according to the above method, and the results of
each determination, are shown in Table 1.

Table 1.

Catalytic Efficiency Of Various Forms Of NS3 Protease

Construct                              kcat (min^-1)  Km (µM)   kcat/Km (M^-1 s^-1)
NS3(1-631)-NS4A(1-54)                  10 ± 2         20 ± 2    (8 ± 2) x 10^3
His-NS3(1-181) + NS4A Peptide (a)      3 ± 1          80 ± 20   (0.5 ± 0.2) x 10^3
His-NS4A(21-32)-GSGS-NS3(3-181)        9 ± 2          19 ± 3    (8 ± 2) x 10^3
His-NS4A(21-32)-GSGS-NS3(3-181)/I17K   16 ± 3         20 ± 2    (14 ± 2) x 10^3
His-NS4A(21-32)-GSGS-NS3(3-181)/I18K   10 ± 2         22 ± 2    (8 ± 2) x 10^3

(a) [E] = 0.25 nM, [NS4A Peptide] = 10 µM

As can be seen from the foregoing results, all covalent NS4A-NS3
complexes were shown to have a catalytic efficiency equivalent to that
of full-length NS3(1-631)-NS4A(1-54). In contrast, the non-covalent
complex of NS3(1-181) with the NS4A peptide (0.1:8 µM),
KK-(NS4A(21-39))-KK, had a catalytic activity which is 8 fold lower than
that of the full-length NS3(1-631)-NS4A(1-54).
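The relationship between the first two numeric columns of Table 1 and the
third can be illustrated with a unit conversion. The sketch below is not
part of the original analysis. It assumes that the third column is
kcat/Km expressed in M^-1 s^-1, uses only the central values (the stated
uncertainties are ignored), and labels the rows descriptively rather than
by construct name.

```python
# Illustrative sketch: convert kcat (min^-1) and Km (µM) from Table 1 into
# kcat/Km in M^-1 s^-1. ASSUMPTION: the third column of Table 1 is kcat/Km
# in M^-1 s^-1; central values only, uncertainties are ignored.


def catalytic_efficiency(kcat_per_min: float, km_micromolar: float) -> float:
    """Return kcat/Km in M^-1 s^-1."""
    kcat_per_s = kcat_per_min / 60.0
    km_molar = km_micromolar * 1e-6
    return kcat_per_s / km_molar


# (kcat in min^-1, Km in µM), in the row order of Table 1.
TABLE_1 = {
    "row 1, full-length NS3-NS4A": (10.0, 20.0),
    "row 2, catalytic domain + NS4A peptide": (3.0, 80.0),
    "row 3, tethered, no point mutation": (9.0, 19.0),
    "row 4, tethered point mutant": (16.0, 20.0),
    "row 5, tethered point mutant": (10.0, 22.0),
}

for label, (kcat, km) in TABLE_1.items():
    print(f"{label}: {catalytic_efficiency(kcat, km):.0f} M^-1 s^-1")
```

Under this assumption the central values come out at roughly 8.3 x 10^3,
0.6 x 10^3, 7.9 x 10^3, 13.3 x 10^3 and 7.6 x 10^3, each within the
uncertainty quoted in the table.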

EXAMPLE 6

High Throughput Screening Assays
Using Covalent NS4A-NS3 Complexes

The claimed covalent NS4A-NS3 complexes are useful in
screening methods for identifying NS3 protease inhibitors. One such
method in which the claimed covalent complexes can be used is
illustrated below.

Surface Plasmon Resonance Assay

The present example illustrates a method for determining if a
compound can be useful as an HCV protease inhibitor using the surface
plasmon resonance assay. Figures 5A and 5B schematically depict the
technique.

BIAcore™ is a processing unit for Biospecific Interaction Analysis.
The processing unit integrates an optical detection system with an
autosampler and a microfluidic system. BIAcore™ uses the optical
phenomenon of surface plasmon resonance (SPR) to monitor interactions
between biomolecules.

SPR is a resonance phenomenon between incoming photons and
electrons on the surface of a thin metal film. Resonance occurs at a
sharply defined angle of incident light. At this angle, called the
resonance angle, energy is transferred to the electrons in the metal
film, resulting in a decreased intensity of the reflected light. The SPR
response depends on a change in refractive index in the close vicinity
of the sensor chip surface, and is proportional to the mass of analyte
bound to the surface. The BIAcore™ continuously measures the resonance
angle by a relative scale of resonance units (RU) and displays it as an
SPR signal in a sensorgram, where RU are plotted as a function of time.

BIAcore™ uses continuous flow technology. One interactant is
immobilized irreversibly on the sensor chip, comprising a
non-crosslinked carboxymethylated dextran providing a hydrophilic
environment for bimolecular interaction. Solution containing the other
interactant flows continuously over the sensor chip surface. As
molecules from the solution bind to the immobilized ligand, the
resonance angle changes, resulting in a signal registered by the
instrument.

In this methodology, the enzymatic reactions are carried out
outside of the BIAcore™, in reaction tubes or 96-well tissue culture
plates, as is conventionally done for any of the other available high
throughput assays. The SPR is only used as a detection means for
determination of the amount of an intact substrate remaining in a
solution after the reaction is quenched.

In order to measure the amount of the intact substrate prior to the
addition of enzyme, a means of capturing the substrate onto the sensor
chip had to be established. In addition, to satisfy the requirement for a
high throughput assay on the BIAcore™, the substrate needed to be
removed from the surface after completion of analysis, so that the same
surface can be used for subsequent reactions. To accomplish these two
requirements, a phosphotyrosine is synthetically attached to one end of
the substrate. The phosphotyrosine was chosen due to the commercial
availability of an anti-phosphotyrosine monoclonal antibody. The
antibody is covalently attached to the sensor chip by standard amine
coupling chemistry. The anti-phosphotyrosine antibody, bound
permanently to the chip, is used to capture the phosphotyrosine in a
reversible manner. The antibody-phosphotyrosine interaction is
ultimately used to capture and release the attached peptide substrate.
After completion of analysis, the surface can be regenerated using
various reagents such as 2 M MgCl2.

When an intact peptide substrate is introduced onto the antibody
surface, a large mass is detected by the instrument. To follow the extent
of peptide cleavage, a mixture of peptide substrate and enzyme is
incubated for the desired time and then quenched. Introduction of this
mixture, containing both cleaved peptide and intact peptide, to a
regenerated antibody surface results in detection by the instrument of a
lower mass than that detected for the sample containing only intact
peptide. The difference in the two values is then used to calculate the
exact amount of intact peptide remaining after cleavage by the enzyme.
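The calculation from the two responses can be sketched as follows. This
is an illustration under stated assumptions, not the calculation actually
used: it assumes that the response in RU is linear in the amount of
intact peptide captured, and every function name and number in it is
hypothetical.

```python
# Illustrative sketch: estimate the fraction of intact peptide from SPR
# responses. ASSUMPTIONS: the response is linear in the amount of intact
# peptide captured, and all names and numbers here are hypothetical.


def fraction_intact(ru_sample: float, ru_intact_only: float,
                    ru_fully_cleaved: float = 0.0) -> float:
    """Fraction of peptide still intact, from response units (RU)."""
    span = ru_intact_only - ru_fully_cleaved
    if span <= 0:
        raise ValueError("intact-only response must exceed the cleaved baseline")
    return (ru_sample - ru_fully_cleaved) / span


def percent_inhibition(ru_with_inhibitor: float, ru_no_inhibitor: float,
                       ru_intact_only: float) -> float:
    """Percent inhibition relative to an uninhibited reaction."""
    cleaved_max = ru_intact_only - ru_no_inhibitor
    cleaved_obs = ru_intact_only - ru_with_inhibitor
    return 100.0 * (1.0 - cleaved_obs / cleaved_max)


print(fraction_intact(ru_sample=600.0, ru_intact_only=1000.0))
print(percent_inhibition(ru_with_inhibitor=900.0, ru_no_inhibitor=600.0,
                         ru_intact_only=1000.0))
```

In a screening setting the same two functions would be applied to each
well, with the intact-only and uninhibited responses serving as controls.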

Although the reduction in mass can be directly followed with
many large substrates, due to the small mass of a typical synthetic
peptide substrate (10-20 amino acids, 1-3 kDa), the mass difference,
and thus the signal difference between the intact and cleaved peptide, is
very small relative to the signal-to-noise ratio of the instrument. To
circumvent this low sensitivity, a biotin can be attached at the
N-terminus of the peptide. Streptavidin can then be added, thus tagging
the peptide. When the tagged peptide is introduced onto the antibody
surface of the chip, the signal will be higher. The signal resulting from
introduction of a cleaved peptide which lacks the N-terminal half (and
thus the streptavidin) will be much lower.

To carry out this method, an HCV protease 5A/5B peptide
substrate (such as 5A/5B substrate DTEDVVACSMSYTWYG-K (SEQ ID
NO: 91)) is synthesized with an additional phosphotyrosine at the
C-terminus and a biotin at the N-terminus. The biotin is then tagged
with streptavidin. An anti-phosphotyrosine monoclonal antibody, 4G10
(Upstate Biotechnology Inc., Lake Placid, New York), is coupled to the
sensor chip. In the absence of an active, uninhibited HCV protease,
introduction of the intact phosphotyrosine peptide results in a large
signal (large mass unit/large signal) through its interaction with the
anti-phosphotyrosine monoclonal antibody (Mab).

The protease-catalyzed hydrolysis of the phosphotyrosine-biotinylated
peptide is carried out in a 96 well plate. The reaction is
stopped with an equal volume of mercuribenzoate. The cleaved peptide,
which lacks the tagged streptavidin (less mass), results in the loss of
response units (lower signal).

Using this method, numerous compounds can be tested for their
inhibitory activity since the antibody surface can be regenerated
repetitively with 2 M MgCl2.

Procedure for Coupling Anti-phosphotyrosine Mab t o the Sensor Chip 

The anti-phosphotyrosine Mab is coupled to the 
25 carboxymethylated dextran surface of a sensor chip in the following 
manner. The flow rate used throughout the coupling procedure is 5 
jll/min. The surface is first activated with a 35 \il injection of NHS/EDC 
(N-hydroxysuccinimide/N-dimethyllaminopropyl-N'- 
ethylcarbodiimide-HCl). This is followed by a 40 ml injection of Mab 
30 4G10 at 50 \ig/ml in 10 mM sodium acetate buffer, pH=4.0. Any 

remaining activated esters are then blocked by the injection of 35 [d of 
1 M ethanolamine. These conditions result in the immobilization of 
approximately 7,500 response units (420 |LiM) of antibody. 
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Binding of Peptide and Regeneration of Mab 4G10 Surface 

The flow rate used throughout the BIAcore analysis run is 5
µl/min. A 4 µl injection containing streptavidin-tagged peptide (peptide
concentration at 2 µM, streptavidin binding sites concentration at 9 µM) is
carried out. The amount of streptavidin-tagged peptide bound to the
antibody surface (in response units) is measured 30 seconds after the
injection is complete.


Regeneration of sensor chip surface 

Regeneration of the Mab 4G10 surface is achieved using a 4 µl
pulse of 2 M MgCl2 after each peptide injection. Surfaces regenerated up
to 500 times still showed 100% binding of tagged peptide.

Determination of the Optimal Concentration of Peptide and
Streptavidin

To determine the optimal peptide concentration, a standard curve
was generated using various amounts of peptide (0-10 µM) in the
presence of excess streptavidin. A value in the linear range, 2 µM, was
chosen for standard assay conditions.

The amount of streptavidin required to completely tag the peptide
is determined using a peptide concentration of 2.5 µM and titrating the
amount of streptavidin (µM of binding sites). All the peptides were
shown to be completely tagged when streptavidin concentrations greater
than 3 µM (approximately equimolar to the peptide concentration) were
used. A streptavidin concentration of 9 µM (a 4.5 fold excess) was
chosen for standard assay conditions.
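
The 4.5 fold excess quoted above is the ratio of streptavidin binding sites (9 µM) to the 2 µM peptide used under standard assay conditions. The Python sketch below reproduces that arithmetic; the function names are hypothetical. The conversion to tetramer concentration is an added illustration that relies on streptavidin having four biotin-binding sites per tetramer.

```python
# Illustrative sketch: excess of streptavidin binding sites over peptide.
# Function names are hypothetical; concentrations are in µM.

BINDING_SITES_PER_TETRAMER = 4  # streptavidin binds up to four biotins per tetramer


def fold_excess(binding_sites_um, peptide_um):
    """Ratio of biotin-binding sites to biotinylated peptide."""
    if peptide_um <= 0:
        raise ValueError("peptide concentration must be positive")
    return binding_sites_um / peptide_um


def tetramer_concentration(binding_sites_um):
    """Streptavidin tetramer concentration implied by a binding-site concentration."""
    return binding_sites_um / BINDING_SITES_PER_TETRAMER


print(fold_excess(9.0, 2.0))        # standard assay conditions
print(tetramer_concentration(9.0))  # µM of tetramer
```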

Application of Described Methodology to
Covalent HCV NS4A-NS3 Complexes


The HCV protease 5A/5B peptide substrate (such as 5A/5B
substrate DTEDVVACSMSYTWTG-K (SEQ ID NO: 91)), with a
phosphotyrosine synthetically attached to the C-terminus and a biotin
attached at the N-terminus, is synthesized. Anti-phosphotyrosine
monoclonal antibody, 4G10, is coupled to the sensor chip.

In the absence of active, uninhibited covalent HCV NS4A-NS3
complex, the introduction of the intact streptavidin-tagged biotinylated
phosphotyrosine peptide to the sensor chip results in a large signal (large
mass unit/large response units) through its interaction with the
anti-phosphotyrosine monoclonal antibody.

The protease-catalyzed hydrolysis of the phosphotyrosine-biotinylated
peptide is carried out with and without a suspected inhibitor
in a 96 well plate. The reaction is stopped with an equal volume of the
quenching buffer containing mercuribenzoate. Streptavidin is then
added to tag the peptide. The cleaved peptide, which lacks the
streptavidin (less mass), results in the loss of response units.

Using this assay, numerous compounds can be tested for their
inhibitory activity since the antibody surface can be regenerated
repetitively with 2 M MgCl2.
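
One way the response-unit readings from such a screen might be reduced to a percent inhibition is sketched below in Python. This is an assumption for illustration, not a calculation stated in the procedure: it supposes that the signal varies linearly with the amount of intact peptide, and that each plate carries a no-enzyme control (intact peptide, high signal) and an uninhibited-enzyme control (cleaved peptide, low signal). The function and argument names are hypothetical.

```python
# Illustrative sketch, assuming a linear relation between response units (RU)
# and the amount of intact, streptavidin-tagged peptide. Names are hypothetical.


def percent_inhibition(ru_sample, ru_no_enzyme, ru_uninhibited):
    """Percent inhibition of cleavage for one well.

    ru_no_enzyme   -- signal for intact peptide (no protease added)
    ru_uninhibited -- signal after cleavage by uninhibited protease
    ru_sample      -- signal for protease plus suspected inhibitor
    """
    window = ru_no_enzyme - ru_uninhibited
    if window <= 0:
        raise ValueError("no-enzyme control must give a higher signal than the enzyme control")
    return 100.0 * (ru_sample - ru_uninhibited) / window


# Example with made-up readings: a well halfway between the two controls.
print(percent_inhibition(ru_sample=600.0, ru_no_enzyme=1000.0, ru_uninhibited=200.0))
```

A value near 0 then corresponds to uninhibited cleavage and a value near 100 to complete protection of the substrate.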

Standard Operating Procedure for BIAcore-based HCV Assay 


Reactions are prepared in a 96-well tissue culture plate using the
Reaction Buffer (50 mM HEPES, pH 7.4, 20% glycerol, 150 mM NaCl,
1 mM EDTA, 0.1% Tween-20, 1 mM DTT) as diluent. The final reaction
volume is 100 µl. Sample with the peptide alone
(Biotin-DTEDVVACSMSYTWTGKpY) is prepared by addition of 10 µl of peptide
stock at 100 µM (prepared in the reaction buffer) to 90 µl of reaction
buffer, so that the final concentration of peptide is 10 µM. Samples
comprised of peptide and the covalent NS4A-NS3 complexes are prepared by
addition of 10 µl of peptide stock at 100 µM and 10 µl of covalent
NS4A-NS3 stock at 0.17 mg/ml (both prepared in the reaction buffer) to
80 µl of reaction buffer, so that the final concentration of peptide and
the enzyme is 10 µM and 0.1 µM, respectively. The reaction is held at
30°C for the specified time and then quenched. Quenching is achieved by
transferring a 20-µl
Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
355 360 365

Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His
370 375 380

Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly
385 390 395 400

Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
405 410 415

Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
420 425 430

Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
435 440 445

Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
450 455 460

Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
465 470 475 480

Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg
485 490 495 

Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
500 505 510 

Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu 
515 520 525 

Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
530 535 540

Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His
545 550 555 560

Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val
565 570 575

Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
580 585 590

Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
595 600 605

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val
610 615 620

Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala
625 630 635 640 

Asp Leu Glu Val Val Thr * 
activity using a scintillation proximity assay (SPA, Amersham Life
Science Inc., Arlington Heights, IL). The unwinding activity present in
this covalent His-NS4A(21-32)-GSGS-NS3(3-631) complex was compared with
that of the full length His-NS3(1-631)/NS4A(1-54) complex under their
corresponding optimal buffer conditions. The double stranded RNA
substrate (Oligos, Etc., Inc., Wilsonville, OR) used in the assay contained a
template 5'-GCU CGC CCG GGG AUG CUC UAG GAA UAC ACG UUC
GAU-3' (SEQ ID NO: 121) annealed to a primer 5'-C UAGAG GAUCCC
CGGGCG AGC CCU AUA GUG AGU CGU-3' (complementary
sequences of the template and the primer are underlined). This
substrate is end-labeled with 32P using T4 polynucleotide kinase.

The assay conditions for the covalent His-NS4A(21-32)-GSGS-NS3(3-631)
complex were 100 mM MOPS [pH 7.0], 0.5 mM MgCl2, 2 mM ATP, 0.5
mM DTT, 100 mg/ml BSA, 2% dimethylsulfoxide (DMSO) and 1 U
RNase inhibitor (5 prime->3 prime, Inc., Boulder, CO). For the full
length His-NS3(1-631)/NS4A(1-54) complex, the assay conditions were 100 mM
PIPES [pH 6.0], 1 mM MgCl2, 2 mM ATP, 0.6 mM DTT, 100 mg/ml BSA
and 1 U RNase inhibitor. In both reactions, 0.5 nM double stranded
RNA substrate in a final volume of 50 µl was used. The reaction was
carried out at 37°C for 1 h and terminated by an addition of 10 µl of 0.5
M EDTA. The released primer was captured using 60 µl of 100 nM
biotinylated capture oligomer (5'-biotin-GCT-CGC-CCG-GGG-ATC-CTC-
TAG-3') (Gibco/BRL, Grand Island, NY) (SEQ ID NO: 123) in 2X
hybridization buffer (40 mM HEPES [pH 7.3], 2 M NaCl, 2 mg/ml BSA) at
37°C for 1 h. The primer-oligomer complex was retained by
streptavidin-coated SPA beads (SPA, Amersham Life Science Inc.,
Arlington Heights, IL), filtered and washed thoroughly with wash buffer
(20 mM HEPES [pH 7.3], 15 mM NaCl, 1.5 mM sodium citrate and 0.05%
SDS). The amount of the released labeled primer was quantified using a
TopCount reader (Packard A991200, Meriden, CT).

As shown in Fig. 6, the covalent His-NS4A(21-32)-GSGS-NS3(3-631)
displayed nucleic acid unwinding activity which was proportional to the
concentration of enzyme. In the linear range of the assay for both
enzymes (1 - 10 pM), about 5-6 fold more product was released by the
His-NS4A(21-32)-GSGS-NS3(3-631) than that from an equivalent concentration
of full length His-NS3(1-631)/NS4A(1-54) complex. In addition, 10 fold less
covalent His-NS4A(21-32)-GSGS-NS3(3-631) complex was required to yield a
similar percentage of unwound products compared with the full length
His-NS3(1-631)/NS4A(1-54) complex in the corresponding reactions.

The nucleic acid unwinding activity associated with the
recombinant covalent His-NS4A(21-32)-GSGS-NS3(3-631) complex is useful for
screening inhibitors of this function. For antiviral screening,
compounds were tested at concentrations of less than 40 mM in the assay
conditions as described above except that 0.3 nM of the double stranded
RNA substrate and 20 pM of the covalent His-NS4A(21-32)-GSGS-NS3(3-631)
complex were used in a reaction which was carried out at room
temperature for 30 minutes. The inhibition of the enzyme was
monitored by a decrease in the level of released labeled primer as
reflected in fewer counts in the capture assay. The IC50 of the inhibitory
compounds was determined as the concentration of the compounds
required to inhibit 50% of the unwinding activity.
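
The text does not state how the 50% point was located on the dose-response data. As one possible approach, offered only as a sketch, the IC50 can be estimated by interpolating on a logarithmic concentration scale between the two tested concentrations that bracket 50% residual activity. The function name and the example numbers below are hypothetical.

```python
# Illustrative sketch: IC50 by log-linear interpolation between the two tested
# concentrations that bracket 50% residual activity. Not the method of record;
# the function name and the sample data are hypothetical.
import math


def ic50_by_interpolation(concentrations, percent_activity):
    """Return the interpolated IC50, or None if 50% is never crossed.

    concentrations   -- positive inhibitor concentrations (any consistent unit)
    percent_activity -- unwinding activity as % of the uninhibited control
    """
    pairs = sorted(zip(concentrations, percent_activity))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(pairs, pairs[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            # Fraction of the way from c_lo to c_hi at which activity reaches 50%.
            frac = (a_lo - 50.0) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Made-up example: activity falls from 75% to 25% between 1 and 100 units.
print(ic50_by_interpolation([1.0, 100.0], [75.0, 25.0]))
```

A four-parameter logistic fit would be a common alternative when enough concentrations are tested; interpolation is shown here because it needs no fitting library.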


EXAMPLES 
Determination of ATPase activity

ATPase activity of the covalent His-NS4A(21-32)-GSGS-NS3(3-631)
complex (SEQ ID NO: 4) was monitored by direct measurement of
[α-32P]ATP hydrolysis using thin layer chromatography. The enzyme was
incubated with 1 mM ATP mixed with [α-32P]ATP (3000 Ci/mmol,
approximately 0.5 mCi per reaction) in a reaction buffer containing 50
mM HEPES [pH 7.3], 10 mM KCl, 0.5 mM DTT, 100 mg/ml bovine serum
albumin, fraction V (BSA), 1 mM MgCl2 in the presence or absence of 1
mM polyuridylic acid (poly U) (Pharmacia, Piscataway, NJ) in a final
volume of 10 µl. The reaction was carried out at 37°C for 1 h and
terminated by an addition of 1 µl of 0.5 M EDTA. Half a microliter of
the reaction mix was spotted onto a polyethyleneimine-cellulose sheet
(SA Scientific Adsorbents Inc, Atlanta, GA) and developed by ascending
chromatography in 0.375 M potassium phosphate buffer [pH 3.5]. The
cellulose sheet was dried and quantified with a Storm 860
PhosphorImager (Molecular Dynamics, Sunnyvale, CA).

The covalent His-NS4A(21-32)-GSGS-NS3(3-631) complex exhibited poly
U dependent ATPase activity which was proportional to the
concentration of the enzyme. The ATP hydrolysis (8 - 13 fold increase)
was enhanced in the presence of poly U at all enzyme concentrations
examined (see Figure 7). Only minimal ATP hydrolysis was observed in
the absence of poly U.
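
For illustration, the fold increase quoted above can be expressed as a ratio of hydrolyzed fractions measured with and without poly U. The Python sketch below assumes that each chromatography lane yields separate counts for the [α-32P]ADP product and the remaining [α-32P]ATP, and that the hydrolyzed fraction is ADP/(ADP + ATP); this quantification scheme, the function names and the sample counts are assumptions rather than details given in the text.

```python
# Illustrative sketch: poly U stimulation of ATP hydrolysis from TLC spot counts.
# Assumes separate counts for the ADP product and the remaining ATP in each lane.
# Function names and sample counts are hypothetical.


def fraction_hydrolyzed(adp_counts, atp_counts):
    """Fraction of labeled nucleotide recovered as ADP in one lane."""
    total = adp_counts + atp_counts
    if total <= 0:
        raise ValueError("lane contains no counts")
    return adp_counts / total


def fold_stimulation(fraction_with_poly_u, fraction_without_poly_u):
    """Ratio of hydrolysis with poly U to hydrolysis without poly U."""
    if fraction_without_poly_u <= 0:
        raise ValueError("basal hydrolysis must be positive to form a ratio")
    return fraction_with_poly_u / fraction_without_poly_u


# Made-up counts for one enzyme concentration.
with_poly_u = fraction_hydrolyzed(adp_counts=4000, atp_counts=6000)
without_poly_u = fraction_hydrolyzed(adp_counts=400, atp_counts=9600)
print(round(fold_stimulation(with_poly_u, without_poly_u), 1))
```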

The presence of ATPase activity in this covalent
His-NS4A(21-32)-GSGS-NS3(3-631) complex makes it suitable for screening
inhibitors against HCV helicase.
WE CLAIM:

1. A covalent HCV NS4A-NS3 complex comprising the
central hydrophobic domain of native HCV NS4A peptide, a linker, and
the HCV NS3 serine protease domain, wherein the hydrophobic domain
of native HCV NS4A peptide is tethered by the linker to the amino
terminus of the HCV NS3 protease domain.

2. The covalent HCV NS4A-NS3 complex of claim 1, wherein
the linker comprises at least about 4 amino acid residues.

3. The covalent HCV NS4A-NS3 complex of claim 2, wherein 
the linker consists essentially of 4-6 amino acid residues. 

4. The covalent HCV NS4A-NS3 complex of claim 3, wherein
the linker consists essentially of about 4 amino acid residues.

5. The covalent HCV NS4A-NS3 complex of claim 4, wherein 
the linker has a sequence defined by SEQ ID NO: 21 or SEQ ID NO: 22. 


6. The covalent HCV NS4A-NS3 complex of claim 5, having 
an amino acid sequence selected from the group consisting of SEQ ID 
NOs: 1-20. 

7. The covalent HCV NS4A-NS3 complex of claim 1 which is
modified by replacement of one or more hydrophobic amino acid
residues at position 17 or 18 of the HCV NS3 serine protease domain
with a hydrophilic amino acid residue.

8. The covalent HCV NS4A-NS3 complex of claim 7 in which
one or more isoleucine residues at position 17 or 18 of the HCV NS3
serine protease domain are replaced by a lysine residue.
9. The covalent HCV NS4A-NS3 complex of claim 8, having
an amino acid sequence selected from the group consisting of SEQ ID
NOs: 2-4, 6-8, 10, 12-14 and 16-18.



10. The covalent HCV NS4A-NS3 complex of claim 1 which is
modified by replacement of a serine residue at position 139 of the HCV
NS3 serine protease domain with an alanine residue.

11. The covalent HCV NS4A-NS3 complex of claim 10, having
an amino acid sequence selected from the group consisting of SEQ ID
NOs: 5-8, 15-18 and 20.



12. A nucleic acid encoding a covalent HCV NS4A-NS3
complex, which covalent HCV NS4A-NS3 complex comprises the
central hydrophobic domain of native HCV NS4A peptide, a linker, and
the HCV NS3 serine protease domain, wherein the hydrophobic domain
of native HCV NS4A peptide is tethered by the amino acid linker to the
amino terminus of the HCV NS3 protease domain.



13. The nucleic acid of claim 12, wherein the linker comprises at
least about 4 amino acid residues.

14. The nucleic acid of claim 13, wherein the linker consists 
essentially of 4-6 amino acid residues. 


15. The nucleic acid of claim 14, wherein the linker consists 
essentially of 4 amino acid residues. 



16. The nucleic acid of claim 15, wherein the amino acid linker
has a sequence defined by SEQ ID NO: 21 or SEQ ID NO: 22.
17. The nucleic acid of claim 16, which encodes a covalent
HCV NS4A-NS3 complex having an amino acid sequence selected from
the group consisting of SEQ ID NOs: 1-20.

18. The nucleic acid of claim 12, which encodes a covalent HCV
NS4A-NS3 complex which is modified by replacement of one or more
hydrophobic amino acid residues at position 17 or 18 of the HCV NS3
serine protease domain with a hydrophilic amino acid residue.

19. The nucleic acid of claim 18 which encodes a covalent HCV
NS4A-NS3 complex in which one or more isoleucine residues at
position 17 or 18 of the HCV NS3 serine protease domain are replaced by
a lysine residue.

20. The nucleic acid of claim 19, which encodes a covalent
HCV NS4A-NS3 complex having an amino acid sequence selected from
the group consisting of SEQ ID NOs: 2-4, 6-8, 10, 12-14 and 16-18.

21. The nucleic acid of claim 12, which encodes a covalent HCV
NS4A-NS3 complex which is modified by replacement of a serine
residue at position 139 of the HCV NS3 serine protease domain with an
alanine residue.

22. The nucleic acid of claim 21, which encodes a covalent
HCV NS4A-NS3 complex having an amino acid sequence selected from
the group consisting of SEQ ID NOs: 5-8, 15-18 and 20.

23. A recombinant vector comprising the nucleic acid of claim 
12, which vector is capable of directing expression of the nucleic acid. 






24. A host cell comprising the recombinant vector of claim 23. 
25. A method for making a covalent HCV NS4A-NS3 complex
comprising culturing the host cell of claim 24 under conditions in which
the nucleic acid or vector is expressed.



26. A method for identifying an HCV NS3 protease inhibitor,
comprising (a) contacting a covalent HCV NS4A-NS3 complex of claim 1
with a peptide substrate and a suspected protease inhibitor under
conditions in which proteolysis can occur; and (b) detecting whether the
covalent HCV NS4A-NS3 complex has cleaved the substrate.


27. A method for identifying an inhibitor of the nucleic acid
unwinding activity of an HCV NS3 helicase, comprising (a) contacting a
covalent HCV NS4A-NS3 complex of SEQ ID NO: 4, 12-19 or 20 with a
double stranded RNA substrate and a suspected inhibitor under
conditions in which unwinding of the substrate can occur; and (b)
detecting whether and the extent to which the covalent HCV NS4A-NS3
complex has unwound the substrate.

28. A method for identifying an inhibitor of an HCV NS3
helicase, comprising (a) contacting a covalent HCV NS4A-NS3 complex
of SEQ ID NO: 4, 12-19 or 20 with ATP and a suspected inhibitor under
conditions in which ATP hydrolysis can occur; and (b) detecting whether
the covalent HCV NS4A-NS3 complex has exhibited ATPase activity.



(1) GENERAL INFORMATION:

(i) APPLICANT: 

(A) NAME: Schering Corp. 

(B) STREET: 2000 Galloping Hill Road 

(C) CITY: Kenilworth 

(D) STATE: New Jersey 

(E) COUNTRY: USA 

(F) ZIP: 07090 

(G) TELEPHONE: 908-298-5056 

(H) TELEFAX: 908-298-5388 

(ii) TITLE OF INVENTION: Covalent Complexes of Hepatitis C Virus 
NS3 Protease and NS4A Cofactor Peptide 

(iii) NUMBER OF SEQUENCES: 123 

(iv) COMPUTER READABLE FORM : 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: Power Macintosh 

(C) OPERATING SYSTEM: 8.0.1 

(D) SOFTWARE: Microsoft Word 6.0.1 

(v) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/067,315 

(B) FILING DATE: 28-NOV-1997 

(A) APPLICATION NUMBER: US 60/094,331 

(B) FILING DATE: 28-JUL-1998 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu






WO 99/28482 






PCT/US98/24528 



35 40 45 

Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly
165 170 175

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser * 
210 215 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

Leu Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly
165 170 175

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 



Ser Met Glu Thr Thr Met Arg Ser 
210 215 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

Leu Gly Cys Ile Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly
165 170 175

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser * 
210 215 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

Leu Gly Cys Lys Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly
165 170 175

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser * 
210 215 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 216 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15 

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly
165 170 175

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser 
210 215 



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 216 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

Leu Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly
165 170 175

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

Ser Met Glu Thr Thr Met Arg Ser
210 215



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

Leu Gly Cys Ile Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly
165 170 175

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

Ser Met Glu Thr Thr Met Arg Ser
210 215



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

Leu Gly Cys Lys Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly
165 170 175

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

Ser Met Glu Thr Thr Met Arg Ser *
210 215



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Pro Ala Gly Gly Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 



Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly
165 170 175

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser * 
210 215 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Pro Ala Gly Gly Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

Leu Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly 
165 170 175 
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser * 
210 215 



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 665 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly
165 170 175

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro
225 230 235 240

Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln
245 250 255

Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly
260 265 270

Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg
275 280 285

Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr
290 295 300

Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp
305 310 315 320

Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu
325 330 335

Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
340 345 350 

Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

Pro Asn He Glu Glu Val Ala Leu Ser Asn Thr Gly Glu lie Pro Phe 
370 375 380 

Tyr Gly Lys Ala He Pro lie Glu Ala lie Arg Gly Gly Arg His Leu 
385 390 395 400 

lie Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu 
405 410 415 

Ser Gly Leu Gly lie Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 
420 425 430 

Ser Val lie Pro Thr He Gly Asp Val Val Val Val Ala Thr Asp Ala 
435 440 445 

Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val lie Asp Cys Asn 
450 455 460 

Thr Cys Val Thr Gin Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr 
465 470 475 480 

lie Glu Thr Thr Thr Val Pro Gin Asp Ala Val Ser Arg Ser Gin Arg 
485 490 495 



Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly lie Tyr Arg Phe Val Thr 
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500 



505 



510 



Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

Gin Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His 
565 570 575 

lie Asp Ala His Phe Leu Ser Gin Thr Lys Gin Ala Gly Asp Asn Phe 
580 585 590 

Pro Tyr Leu Val Ala Tyr Gin Ala Thr Val Cys Ala Arg Ala Gin Ala 
595 600 605 

Pro Pro Pro Ser Trp Asp Gin Met Trp Lys Cys Leu lie Arg Leu Lys 
610 615 620 

Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala 
645 650 655 

Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 665 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15 

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu 
20 25 30 

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu 
35 40 45 

Leu Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val 
50 55 60 

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala 
65 70 75 80 



Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn 
100 105 110 

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser 
115 120 125 

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser 
145 150 155 160 

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly 
165 170 175 

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala 
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro 
225 230 235 240 

Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln 
245 250 255 

Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg 
275 280 285 

Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr 
290 295 300 

Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu 
325 330 335 

Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu 
340 345 350 

Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe 
370 375 380 

Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu 
385 390 395 400 

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu 
405 410 415 

Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 
420 425 430 

Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala 
435 440 445 

Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn 
450 455 460 

Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr 
465 470 475 480 

Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg 
485 490 495 

Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr 
500 505 510 

Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His 
565 570 575 

Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe 
580 585 590 

Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala 
595 600 605 

Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys 
610 615 620 

Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala 
645 650 655 

Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 665 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15 

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu 
20 25 30 

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu 
35 40 45 

Leu Gly Cys Ile Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val 
50 55 60 

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala 
65 70 75 80 

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn 
100 105 110 

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser 
115 120 125 

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser 
145 150 155 160 

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly 
165 170 175 

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala 
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro 
225 230 235 240 

Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln 
245 250 255 

Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg 
275 280 285 

Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr 
290 295 300 

Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu 
325 330 335 

Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu 
340 345 350 

Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe 
370 375 380 

Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu 
385 390 395 400 

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu 
405 410 415 

Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 
420 425 430 

Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala 
435 440 445 

Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn 
450 455 460 

Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr 
465 470 475 480 

Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg 
485 490 495 

Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr 
500 505 510 

Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His 
565 570 575 

Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe 
580 585 590 

Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala 
595 600 605 

Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys 
610 615 620 

Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala 
645 650 655 

Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 665 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15 

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu 
20 25 30 

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu 
35 40 45 

Leu Gly Cys Lys Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val 
50 55 60 

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala 
65 70 75 80 

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn 
100 105 110 

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser 
115 120 125 

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 



His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser 
145 150 155 160 

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly 
165 170 175 

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala 
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro 
225 230 235 240 

Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln 
245 250 255 

Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg 
275 280 285 

Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr 
290 295 300 

Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu 
325 330 335 

Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu 
340 345 350 

Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe 
370 375 380 

Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu 
385 390 395 400 

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu 
405 410 415 

Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 
420 425 430 

Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala 
435 440 445 

Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn 
450 455 460 

Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr 
465 470 475 480 

Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg 
485 490 495 

Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr 
500 505 510 

Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His 
565 570 575 

Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe 
580 585 590 

Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala 
595 600 605 

Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys 
610 615 620 

Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala 
645 650 655 

Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 665 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15 

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu 
20 25 30 

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu 
35 40 45 

Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val 
50 55 60 

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala 
65 70 75 80 

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn 
100 105 110 

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser 
115 120 125 

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser 
145 150 155 160 

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly 
165 170 175 

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala 
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro 
225 230 235 240 

Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln 
245 250 255 

Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg 
275 280 285 

Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr 
290 295 300 

Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu 
325 330 335 

Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu 
340 345 350 

Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe 
370 375 380 

Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu 
385 390 395 400 

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu 
405 410 415 

Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 
420 425 430 

Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala 
435 440 445 

Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn 
450 455 460 

Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr 
465 470 475 480 

Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg 
485 490 495 

Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr 
500 505 510 

Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His 
565 570 575 

Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe 
580 585 590 

Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala 
595 600 605 

Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys 
610 615 620 

Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala 
645 650 655 

Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 665 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15 

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu 
20 25 30 

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu 
35 40 45 

Leu Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val 
50 55 60 

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala 
65 70 75 80 

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn 
100 105 110 

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser 
115 120 125 

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser 
145 150 155 160 

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly 
165 170 175 

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala 
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 



Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro 
225 230 235 240 

Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln 
245 250 255 

Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg 
275 280 285 

Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr 
290 295 300 

Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu 
325 330 335 

Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu 
340 345 350 

Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe 
370 375 380 

Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu 
385 390 395 400 

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu 
405 410 415 

Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 
420 425 430 

Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala 
435 440 445 

Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn 
450 455 460 

Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr 
465 470 475 480 

Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg 
485 490 495 

Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr 
500 505 510 

Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His 
565 570 575 

Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe 
580 585 590 

Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala 
595 600 605 

Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys 
610 615 620 

Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala 
645 650 655 

Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 665 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15 

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu 
20 25 30 

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu 
35 40 45 

Leu Gly Cys Ile Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val 
50 55 60 

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala 
65 70 75 80 

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn 
100 105 110 

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser 
115 120 125 

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser 
145 150 155 160 

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly 
165 170 175 

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala 
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro 
225 230 235 240 

Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln 
245 250 255 

Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg 
275 280 285 

Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr 
290 295 300 

Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu 
325 330 335 

Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu 
340 345 350 

Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe 
370 375 380 

Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu 
385 390 395 400 

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu 
405 410 415 

Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 
420 425 430 

Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala 
435 440 445 

Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn 
450 455 460 

Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr 
465 470 475 480 

Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg 
485 490 495 

Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr 
500 505 510 

Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His 
565 570 575 

Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe 
580 585 590 

Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala 
595 600 605 

Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys 
610 615 620 

Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala 
645 650 655 

Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 665 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15 

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu 
20 25 30 

Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu 
35 40 45 

Leu Gly Cys Lys Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val 
50 55 60 

Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala 
65 70 75 80 

Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn 
100 105 110 

Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser 
115 120 125 

Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser 
145 150 155 160 

Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly 
165 170 175 

Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala 
180 185 190 

Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro 
225 230 235 240 

Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln 
245 250 255 

Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg 
275 280 285 

Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr 
290 295 300 

Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu 
325 330 335 

Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu 
340 345 350 

Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe 
370 375 380 

Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu 
385 390 395 400 

Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu 
405 410 415 

Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val 
420 425 430 

Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala 
435 440 445 

Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn 
450 455 460 

Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr 
465 470 475 480 

Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg 
485 490 495 

Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr 
500 505 510 

Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His 
565 570 575 

Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe 
580 585 590 

Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala 
595 600 605 

Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys 
610 615 620 

Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala 
645 650 655 

Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 671 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15 

Arg Gly Ser His Met Ala Tyr Ser Leu Thr Thr Gly Ser Val Val Ile 
20 25 30 

Val Gly Arg Ile Ile Leu Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser 
35 40 45 

Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly 
50 55 60 

Arg Asp Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala 
65 70 75 80 

Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val 
85 90 95 

Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile 
100 105 110 

Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Gln Ala 
115 120 125 

Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp 
130 135 140 

Leu Tyr Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg 
145 150 155 160 

Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu 
165 170 175 

Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val 
180 185 190 

Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val 
195 200 205 

Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val 
210 215 220 

Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val 
225 230 235 240 

Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 
245 250 255 

Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser 
260 265 270 

Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly 
275 280 285 

Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala 
290 295 300 

Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys 
305 310 315 320 

Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr 
325 330 335 

Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu 
340 345 350 

Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly 
355 360 365 

Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Asn 
370 375 380 

Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile 
385 390 395 400 

Arg Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp 
405 410 415 

Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr 
420 425 430 

Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ile Gly Asp Val Val 
435 440 445 

Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp 
450 455 460 

Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser 
465 470 475 480 

Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala 
485 490 495 

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly 
500 505 510 

Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp 
515 520 525 

Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu 
530 535 540 

Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr 
545 550 555 560 

Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Ser Val 
565 570 575 

Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys 
580 585 590 

Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val 
595 600 605 

Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys 
610 615 620 

Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu 
625 630 635 640 

Tyr Arg Leu Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile 
645 650 655 

Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 670 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 671 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
15 10 15 
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Arg Gly Ser His Met Ala Tyr Ser Leu Thr Thr Gly Ser Val Val He 

20 25 30 

Val Gly Arg He He Leu Ser Gly Ser Gly Ser He Thr Ala Tyr Ser 

35 40 45 

Gin Gin Thr Arg Gly Leu Leu Gly Cys lie lie Thr Ser Leu Thr Gly 

50 55 60 



Arg Asp Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala
65 70 75 80

Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val
85 90 95

Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile
100 105 110

Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Gln Ala
115 120 125

Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp
130 135 140

Leu Tyr Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg
145 150 155 160

Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu
165 170 175

Lys Gly Ser Ala Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val
180 185 190

Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val
195 200 205



Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val
210 215 220

Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val
225 230 235 240

Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro
245 250 255

Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser
260 265 270



Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly 
275 280 285 

Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala
290 295 300

Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys
305 310 315 320

Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr
325 330 335

Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu
340 345 350

Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly 
355 360 365 

Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Asn
370 375 380

Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile
385 390 395 400

Arg Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp
405 410 415

Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr
420 425 430

Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ile Gly Asp Val Val
435 440 445

Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp
450 455 460

Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser
465 470 475 480

Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala
485 490 495

Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly
500 505 510

Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp
515 520 525

Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu
530 535 540

Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr
545 550 555 560

Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Ser Val
565 570 575

Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys
580 585 590

Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val
595 600 605

Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys
610 615 620

Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu
625 630 635 640

Tyr Arg Leu Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile
645 650 655

Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp Leu Glu Val Val
660 665 670

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Gly Ser Gly Ser 
1 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

Pro Ala Gly Gly 
1 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1964 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1964

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 632 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
1 5 10 15

Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
20 25 30

Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
35 40 45

Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
50 55 60

Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
65 70 75 80

Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr
85 90 95

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
100 105 110

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
115 120 125

Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
130 135 140

Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
145 150 155 160

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met
165 170 175

Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro
180 185 190

Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly
195 200 205

Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr
210 215 220

Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly
225 230 235 240

Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly
245 250 255

Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly
260 265 270

Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
275 280 285

Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile
290 295 300

Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
305 310 315 320

Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
325 330 335

Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly
340 345 350

Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe
355 360 365

Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly
370 375 380

Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
385 390 395 400

Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
405 410 415

Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
420 425 430

Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
435 440 445

Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly
450 455 460

Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly
465 470 475 480

Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr
485 490 495

Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val
500 505 510

Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
515 520 525

His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp
530 535 540

Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr
545 550 555 560

Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro
565 570 575

Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr
580 585 590

Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn
595 600 605

Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met
610 615 620

Ser Ala Asp Leu Glu Val Val Thr
625 630



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr 
1 5 10 15 

Cys Leu Thr Thr Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser
20 25 30

Gly Arg Pro Ala Ile Val Pro Asp Arg Glu Leu Leu Tyr Gln Glu Phe
35 40 45 

Asp Glu Met Glu Glu Cys 
50 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Asp Thr Glu Asp Val Val Cys Cys Ser Met Tyr Thr Trp Thr Gly Lys
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
GATATACATA TGGGTTCTGT TGTTATTGTT GGTAGAATTA TTTTATCTGG TAGTGGTAGT 60 
ATCACGGCCT ACTCCCAA 78 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 
CTCAGCGAAT TCTCAAGACC GCATAGTAGT TTCCAT 36 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CGGGGCCTAC TTGGTTGCAA GATCACTAGC CTTACAGGC 39
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GCCTGTAAGG CTAGTGATCT TGCAACCAAG TAGGCCCCG 39 
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
CGGGGCCTAC TTGGTTGCAT CAAGACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
GCCTGTAAGG CTAGTCTTGA TGCAACCAAG TAGGCCCCG 39 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
CGGGGCCTAC TTGGTTGCAA GAAGACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
GCCTGTAAGG CTAGTCTTCT TGCAACCAAG TAGGCCCCG 39 
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
CTCCTACTTG AAGGGCTCTG CTGGTGGTCC ACTGCTCTGC 40 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
GCAGAGCAGT GGACCACCAG CAGAGCCCTT CAAGTAGGAG 40 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
CGGGGCCTAC TTGGTTGCAA GATCACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GCCTGTAAGG CTAGTGATCT TGCAACCAAG TAGGCCCCG 39 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
CGGGGCCTAC TTGGTTGCAT CAAGACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
GCCTGTAAGG CTAGTCTTGA TGCAACCAAG TAGGCCCCG 39 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
CGGGGCCTAC TTGGTTGCAA GAAGACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
GCCTGTAAGG CTAGTCTTGA TGCAACCAAG TAGGCCCCG 39 
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
GATATACATA TGGGTTCTGT TGTTATTGTT GGTAGAATTA TTTTATCTCC TGCTGGTGGT 60
ATCACGGCCT ACTCCCAA 78 
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
CTCAGCGAAT TCTCAAGACC GCATAGTAGT TTCCAT 36 
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
CGGGGCCTAC TTGGTTGCAA GATCACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
GCCTGTAAGG CTAGTGATCT TGCAACCAAG TAGGCCCCG 39



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 216 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Pro Ala Gly Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu
35 40 45

Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
50 55 60

Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr
65 70 75 80

Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys
85 90 95

Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val
100 105 110

Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu
115 120 125

Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His
130 135 140

Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
145 150 155 160

Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro
165 170 175

Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val
180 185 190

Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser
195 200 205

Met Glu Thr Thr Met Arg Ser *
210 215

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
Pro Ala Gly 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 75 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
GATATACATA TGGGTTCTGT TGTTATTGTT GGTAGAATTA TTTTATCTCC TGCTGGTATC 60 
ACGGCCTACT CCCAA 75 
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
CTCAGCGAAT TCTCAAGACC GCATAGTAGT TTCCAT 36



(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 213 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Pro Ala Gly Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu
35 40 45

Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
50 55 60

Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr
65 70 75 80

Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys
85 90 95

Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val
100 105 110

Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu
115 120 125

Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His
130 135 140

Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
145 150 155 160

Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro
165 170 175

Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val
180 185 190

Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser 
195 200 205 

Met Glu Thr Thr Met 
210 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
CGGGGCCTAC TTGGTTGCAA GATCACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52: 
GCCTGTAAGG CTAGTGATCT TGCAACCAAG TAGGCCCCG 39 



(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 166 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Met Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu
1 5 10 15

Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val
20 25 30

Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu
35 40 45

Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln
50 55 60

Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro
65 70 75 80

Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp
85 90 95

Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser
100 105 110

Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu
115 120 125

Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr
130 135 140

Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu 
145 150 155 160 

Thr Thr Met Arg Ser * 
165 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 3 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Gly Gly Ser 
1 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 75 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
GATATACATA TGGGTTCTGT TGTTATTGTT GGTAGAATTA TTTTATCTGG TGGTTCTATC 60 
ACGGCCTACT CCCAA 75 
(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
CTCAGCGAAT TCTCAAGACC GCATAGTAGT TTCCAT 36 

(2) INFORMATION FOR SEQ ID NO:57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 216 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

Ser Gly Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu
35 40 45

Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
50 55 60

Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr
65 70 75 80

Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys
85 90 95

Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val
100 105 110

Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu
115 120 125

Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His
130 135 140

Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
145 150 155 160

Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro
165 170 175

Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val
180 185 190

Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser 
195 200 205 

Met Glu Thr Thr Met Arg Ser * 
210 215 

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
CGGGGCCTAC TTGGTTGCAA GATCACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 
GCCTGTAAGG CTAGTGATCT TGCAACCAAG TAGGCCCCG 39 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 668 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Met His Met His His His His His His Leu Val Pro Arg Gly Ser Ala 
1 5 10 15

Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Lys
20 25 30

Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
35 40 45

Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn
50 55 60

Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala
65 70 75 80

Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp
85 90 95

Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys
100 105 110

Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val
115 120 125



Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro
130 135 140

Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys
145 150 155 160

Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg
165 170 175

Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr
180 185 190

Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val
195 200 205

Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly
210 215 220

Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
225 230 235 240

Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
245 250 255

Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
260 265 270

Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe
275 280 285

Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
290 295 300



Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr
305 310 315 320

Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
325 330 335

Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
340 345 350

Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
355 360 365

Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His
370 375 380

Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly
385 390 395 400

Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
405 410 415

Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
420 425 430

Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
435 440 445

Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
450 455 460

Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
465 470 475 480

Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg
485 490 495

Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
500 505 510

Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu
515 520 525

Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
530 535 540

Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His
545 550 555 560

Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val
565 570 575

Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
580 585 590

Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
595 600 605

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val
610 615 620

Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala
625 630 635 640

Asp Leu Glu Val Val Thr * Glu Phe Glu Leu Arg Arg Gln Ala Cys
645 650 655

Gly Arg Thr Arg Ala Pro Pro Pro Pro Pro Leu Arg
660 665

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

CGGGGCCTAC TTGGTTGCAA GATCACTAGC CTTACAGGC 39 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 
GCCTGTAAGG CTAGTGATCT TGCAACCAAG TAGGCCCCG 39 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 668 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Met His Met His His His His His His Leu Val Pro Arg Gly Ser Ala 
1 5 10 15

Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile
20 25 30

Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
35 40 45

Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn
50 55 60

Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala
65 70 75 80

Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp
85 90 95

Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys
100 105 110

Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val 
115 120 125 

Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro
130 135 140

Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys
145 150 155 160

Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg
165 170 175

Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr
180 185 190

Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val
195 200 205

Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly
210 215 220

Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
225 230 235 240

Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
245 250 255

Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
260 265 270

Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe
275 280 285

Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
290 295 300

Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr
305 310 315 320

Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
325 330 335

Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
340 345 350

Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
355 360 365

Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His
370 375 380

Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly
385 390 395 400

Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
405 410 415

Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
420 425 430

Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
435 440 445

Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
450 455 460

Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
465 470 475 480

Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg
485 490 495

Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
500 505 510

Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu
515 520 525

Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
530 535 540

Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His
545 550 555 560

Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val
565 570 575

Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
580 585 590

Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
595 600 605

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val
610 615 620

Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala
625 630 635 640

Asp Leu Glu Val Val Thr *

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
CGGGGCCTAC TTGGTTGCAT CAAGACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 
GCCTGTAAGG CTAGTCTTGA TGCAACCAAG TAGGCCCCG 39 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 668 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Met His Met His His His His His His Leu Val Pro Arg Gly Ser Ala 
1 5 10 15

Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile
20 25 30

Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
35 40 45

Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn
50 55 60

Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala
65 70 75 80

Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp
85 90 95

Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys
100 105 110

Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val
115 120 125

Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro
130 135 140

Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly Pro Leu Leu Cys 
145 150 155 160 

Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg
165 170 175

Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr
180 185 190

Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val
195 200 205

Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly
210 215 220

Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
225 230 235 240

Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
245 250 255

Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
260 265 270

Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe
275 280 285

Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
290 295 300

Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr
305 310 315 320

Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
325 330 335

Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
340 345 350

Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
355 360 365

Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His
370 375 380

Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly 
385 390 395 400 

He Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val He Pro 
405 410 415 

Thr He Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly 
420 425 430 

Tyr Thr Gly Asp Phe Asp Ser Val He Asp Cys Asn Thr Cys Val Thr 
435 440 445 

Gin Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr He Glu Thr Thr 
450 455 460 

Thr Val Pro Gin Asp Ala Val Ser Arg Ser Gin Arg Arg Gly Arg Thr 
465 470 475 480 

Gly Arg Gly Arg Arg Gly He Tyr Arg Phe Val Thr Pro Gly Glu Arg 
485 490 495 

Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
500 505 510 

Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu 
515 520 525 

Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gin Asp His Leu 
530 535 540 

Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His He Asp Ala His 
545 550 555 560 

Phe Leu Ser Gin Thr Lys Gin Ala Gly Asp Asn Phe Pro Tyr Leu Val 
565 570 575 

Ala Tyr Gin Ala Thr Val Cys Ala Arg Ala Gin Ala Pro Pro Pro Ser 
580 585 590 

Trp Asp Gin Met Trp Lys Cys Leu He Arg Leu Lys Pro Thr Leu His 
595 600 605 

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gin Asn Glu Val 
610 615 620 

Thr Leu Thr His Pro He Thr Lys Tyr He Met Ala Cys Met Ser Ala 
625 630 635 640 

Asp Leu Glu Val Val Thr * 



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
CTCCTACTTG AAGGGCTCTG CTGGTGGTCC ACTGCTCTGC 40 
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
GCAGAGCAGT GGACCACCAG CAGAGCCCTT CAAGTAGGAG 40 



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 668 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69: 

Met His Met His His His His His His Leu Val Pro Arg Gly Ser Ala
1 5 10 15

Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile
20 25 30

Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
35 40 45

Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn
50 55 60

Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala
65 70 75 80

Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp
85 90 95

Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys
100 105 110

Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val
115 120 125

Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro
130 135 140

Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys
145 150 155 160

Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg
165 170 175

Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr
180 185 190

Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val
195 200 205

Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly
210 215 220

Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
225 230 235 240

Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
245 250 255

Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
260 265 270

Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe
275 280 285

Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
290 295 300

Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr
305 310 315 320

Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
325 330 335

Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
340 345 350

Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
355 360 365

Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His
370 375 380

Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly
385 390 395 400

Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
405 410 415

Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
420 425 430

Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
435 440 445

Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
450 455 460

Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
465 470 475 480

Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg
485 490 495

Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
500 505 510

Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu
515 520 525

Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
530 535 540

Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His
545 550 555 560

Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val
565 570 575

Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
580 585 590

Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
595 600 605

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val
610 615 620

Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala
625 630 635 640

Asp Leu Glu Val Val Thr * Glu Phe Glu Leu Arg Arg Gln Ala Cys
645 650 655

Gly Arg Thr Arg Ala Pro Pro Pro Pro Pro Leu Arg
660 665

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 
GTCCGTCATA CCAACTTCCG GAGACGTCGT TGTCG 35 
(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
CGACAACGAC GTCTCCGGAA GTTGGTATGA CGGAC 35



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 669 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Met His Met His His His His His His Leu Val Pro Arg Gly Ser Ala
1 5 10 15

Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile
20 25 30

Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
35 40 45

Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn
50 55 60

Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala
65 70 75 80

Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp
85 90 95

Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys
100 105 110

Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val
115 120 125

Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro
130 135 140

Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys
145 150 155 160

Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg
165 170 175

Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr
180 185 190

Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val
195 200 205

Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly
210 215 220

Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
225 230 235 240

Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
245 250 255

Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
260 265 270

Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe
275 280 285

Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
290 295 300

Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr
305 310 315 320

Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
325 330 335

Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
340 345 350

Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
355 360 365

Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His
370 375 380

Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly
385 390 395 400

Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
405 410 415

Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
420 425 430

Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
435 440 445

Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
450 455 460

Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
465 470 475 480

Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg
485 490 495

Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
500 505 510

Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu
515 520 525

Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
530 535 540

Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His
545 550 555 560

Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val
565 570 575

Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
580 585 590

Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
595 600 605

Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val
610 615 620

Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala
625 630 635 640

Asp Leu Glu Val Val Thr *



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
ACTAAAGTGC CGGCTGCCTA CGCAGCCCAA GGG 33 
(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 
CCCTTGGGCT GCGTAGGCAG CCGGCACTTT AGT 33 
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 
CGGGGCCTAC TTGGTTGCAA GATCACTAGC CTTACAGGC 39

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 
GCCTGTAAGG CTAGTGATCT TGCAACCAAG TAGGCCCCG 39 
(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 
CGGGGCCTAC TTGGTTGCAT CAAGACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 
GCCTGTAAGG CTAGTCTTGA TGCAACCAAG TAGGCCCCG 39 
(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
CGGGGCCTAC TTGGTTGCAA GAAGACTAGC CTTACAGG 38 
(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 



(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
GCCTGTAAGG CTAGTCTTCT TGCAACCAAG TAGGCCCCG 39
(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 
CGGGGCCTAC TTGGTTGCAA GATCACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 
GCCTGTAAGG CTAGTGATCT TGCAACCAAG TAGGCCCCG 39

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
CGGGGCCTAC TTGGTTGCAT CAAGACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 
GCCTGTAAGG CTAGTCTTGA TGCAACCAAG TAGGCCCCG 39 
(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 
CGGGGCCTAC TTGGTTGCAA GAAGACTAGC CTTACAGGC 39 
(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
GCCTGTAAGG CTAGTCTTCT TGCAACCAAG TAGGCCCCG 39 
(2) INFORMATION FOR SEQ ID NO: 87: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 96 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 
GATATACATA TGGCTTACTC TCTGACTACG GGTTCTGTTG TTATTGTTGG TAGAATTATT 60 
TTATCTGGTA GTGGTAGTAT CACGGCCTAC TCCCAA 96 
(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 
GTGGTGGTGC TCGAGGCTGC CGCGCGGCAC CAGCGTAACG ACCTCCAGGT C 51 
(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 
GATATACATA TGGCTTACTC TCTGACTACG GGTTCTGTTG TTATTGTTGG TAGAATTATT 60 
TTATCTGGTA GTGGTAGTAT CACGGCCTAC TCCCAA 96 
(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 



(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 
TGGTGGTGCT CGAGGCTGCC GCGCGGCACC AGCGTAACGA CCTCCAGGTC 50 
(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Asp Thr Glu Asp Val Val Ala Cys Ser Met Ser Tyr Thr Trp Tyr Gly 
1 5 10 15

Lys 



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..651 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

CTT GGT TGC ATC ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192
Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT 528
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly
165 170 175

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

TCC ATG GAA ACT ACT ATG CGG TCT TGA 651
Ser Met Glu Thr Thr Met Arg Ser *
210 215
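
The CDS feature of SEQ ID NO: 92 pairs each codon row with its translated residues. The sketch below shows how such a pairing could be reproduced with the standard genetic code, using the first and last codon rows of SEQ ID NO: 92 as input. The names `CODON_TABLE`, `THREE_LETTER`, and `translate` are illustrative assumptions, and the use of the standard codon table is assumed rather than stated in this listing.

```python
# Standard genetic code, laid out in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names as used in the listing; "*" marks a stop.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
    "*": "*",
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; a trailing partial codon is ignored."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


if __name__ == "__main__":
    # First codon row of SEQ ID NO: 92 (nucleotides 1-48).
    print(translate("ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG"))
    # Last codon row of SEQ ID NO: 92 (nucleotides 625-651, ending in the TGA stop).
    print(translate("TCC ATG GAA ACT ACT ATG CGG TCT TGA"))
```

Because a trailing partial codon is dropped, a 650-base entry whose final codon row ends in `TG` would translate without a terminal `*`.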



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..651 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

CTT GGT TGC AAG ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192
Leu Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT 528
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly
165 170 175

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

TCC ATG GAA ACT ACT ATG CGG TCT TGA 651
Ser Met Glu Thr Thr Met Arg Ser *
210 215



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 651 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..651



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

CTT GGT TGC ATC AAG ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192
Leu Gly Cys Ile Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT 528
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly
165 170 175

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

TCC ATG GAA ACT ACT ATG CGG TCT TGA 651
Ser Met Glu Thr Thr Met Arg Ser *
210 215



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..651 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

CTT GGT TGC AAG AAG ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192
Leu Gly Cys Lys Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT 528
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly
165 170 175

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

TCC ATG GAA ACT ACT ATG CGG TCT TGA 651
Ser Met Glu Thr Thr Met Arg Ser *
210 215



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 650 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..650

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

CTT GGT TGC ATC ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192
Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT GCT GGT GGT 528
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly
165 170 175

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

TCC ATG GAA ACT ACT ATG CGG TCT TG 650
Ser Met Glu Thr Thr Met Arg Ser
210 215
(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 650 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..650

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

CTT GGT TGC AAG ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192
Leu Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT GCT GGT GGT 528
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly
165 170 175

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

TCC ATG GAA ACT ACT ATG CGG TCT TG 650
Ser Met Glu Thr Thr Met Arg Ser
210 215



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..651 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
15 10 15 

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96 
Arg Gly Ser His Met Gly Ser Val Val He Val Gly Arg He He Leu 
20 25 30 

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144 
Ser Gly Ser Gly Ser He Thr Ala Tyr Ser Gin Gin Thr Arg Gly Leu 
35 40 45 

CTT GGT TGC ATC AAG ACT AGC CTT AC A GGC CGG GAC AAG AAC CAG GTC 192 
Leu Gly Cys He Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gin Val 
50 55 60 

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240 
Glu Gly Glu Val Gin Val Val Ser Thr Ala Thr Gin Ser Phe Leu Ala 
65 70 75 80 

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288 
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
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85 90 95 

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336 
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110 

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384 
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125 

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432 
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480 
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT GCT GGT GGT 528 
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly 
165 170 175 

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576 
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

TCC ATG GAA ACT ACT ATG CGG TCT TGA 651 
Ser Met Glu Thr Thr Met Arg Ser * 
210 215 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..651



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

CTT GGT TGC AAG AAG ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192
Leu Gly Cys Lys Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
130 135 140

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT GCT GGT GGT 528
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly
165 170 175

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

TCC ATG GAA ACT ACT ATG CGG TCT TGA 651
Ser Met Glu Thr Thr Met Arg Ser *
210 215



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..651

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT CCT GCT GGT GGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144 
Ser Pro Ala Gly Gly Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

CTT GGT TGC ATC ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192 
Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60 

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240 
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80 

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288 
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336 
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110 

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384 
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125 

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432 
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480 
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT 528 
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly 
165 170 175 

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576 
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

TCC ATG GAA ACT ACT ATG CGG TCT TGA 651 
Ser Met Glu Thr Thr Met Arg Ser * 
210 215 



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..651



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT CCT GCT GGT GGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144 
Ser Pro Ala Gly Gly Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45 

CTT GGT TGC AAG ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192 
Leu Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60 

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240 
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80 

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95 

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125 

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432 
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480 
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT 528 
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly 
165 170 175 

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576 
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

TCC ATG GAA ACT ACT ATG CGG TCT TGA 651 
Ser Met Glu Thr Thr Met Arg Ser * 
210 215 



(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1998 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1995 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144 
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45

CTT GGT TGC ATC ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192
Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60 

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80 

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288 
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336 
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110 

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384 
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125 

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480 
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT 528 
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly 
165 170 175 

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576 
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

TCC ATG GAA ACT ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC 672 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

CCC CCG GCC GTA CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC 720 
Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro
225 230 235 240 

ACT GGC AGC GGC AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA 768 
Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln
245 250 255 

GGG TAC AAG GTG CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG 816 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270

TTT GGG GCG TAT ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA 864
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg
275 280 285 

ACT GGG GTA AGG ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC 912
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr
290 295 300 

TAT GGC AAG TTT CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC 960 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

ATC ATA ATA TGT GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG 1008 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu
325 330 335 

GGC ATC GGC ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT 1056 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
340 345 350 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC 1104 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

CCA AAC ATC GAG GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC 1152 
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe
370 375 380 

TAT GGC AAA GCC ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC 1200 
Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu
385 390 395 400 

ATT TTC TGT CAT TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG 1248 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu
405 410 415 

TCA GGC CTC GGA ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG 1296 
Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val
420 425 430 

TCC GTC ATA CCA ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT 1344 
Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala
435 440 445 

CTG ATG ACG GGC TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC 1392 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn
450 455 460 

ACA TGT GTC ACC CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC 1440 
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr
465 470 475 480 

ATT GAG ACG ACG ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG 1488 
Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg
485 490 495 

CGG GGT AGG ACT GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT 1536 
Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr
500 505 510

CCG GGA GAA CGG CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG 1584
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu
515 520 525

TGC TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC 1632
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr
530 535 540

TCG GTT AGG TTG CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC 1680
Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys
545 550 555 560

CAG GAC CAC CTG GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT 1728
Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His
565 570 575

ATA GAT GCA CAC TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC 1776
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe
580 585 590

CCC TAC CTG GTA GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC 1824
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala
595 600 605

CCA CCT CCA TCA TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA 1872
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys
610 615 620

CCT ACG CTG CAC GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC 1920
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val
625 630 635 640

CAA AAT GAG GTC ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA 1968
Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
645 650 655

TGC ATG TCG GCT GAC CTG GAG GTC GTC ACT 1998
Cys Met Ser Ala Asp Leu Glu Val Val
660 665



(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1998 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1995



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144 
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45 

CTT GGT TGC AAG ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192 
Leu Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60 

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240 
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80 

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288 
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110 

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384 
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125 

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432 
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480 
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT 528 
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly 
165 170 175 

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

TCC ATG GAA ACT ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC 672 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

CCC CCG GCC GTA CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC 720 
Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro
225 230 235 240 

ACT GGC AGC GGC AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA 768 
Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln
245 250 255 

GGG TAC AAG GTG CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG 816
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

TTT GGG GCG TAT ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA 864 
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg
275 280 285 

ACT GGG GTA AGG ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC 912
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr
290 295 300 

TAT GGC AAG TTT CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC 960 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

ATC ATA ATA TGT GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG 1008 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu
325 330 335 

GGC ATC GGC ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT 1056 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
340 345 350 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC 1104 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

CCA AAC ATC GAG GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC 1152 
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe
370 375 380 

TAT GGC AAA GCC ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC 1200 
Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu
385 390 395 400 

ATT TTC TGT CAT TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG 1248 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu
405 410 415 

TCA GGC CTC GGA ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG 1296 
Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val
420 425 430

TCC GTC ATA CCA ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT 1344
Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala
435 440 445 

CTG ATG ACG GGC TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC 1392 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn
450 455 460 

ACA TGT GTC ACC CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC 1440
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr
465 470 475 480 

ATT GAG ACG ACG ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG 1488 
Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg
485 490 495 

CGG GGT AGG ACT GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT 1536 
Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr
500 505 510 

CCG GGA GAA CGG CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG 1584 
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

TGC TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC 1632 
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

TCG GTT AGG TTG CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC 1680 
Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

CAG GAC CAC CTG GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT 1728 
Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His
565 570 575 

ATA GAT GCA CAC TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC 1776 
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe
580 585 590 

CCC TAC CTG GTA GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC 1824 
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala
595 600 605 

CCA CCT CCA TCA TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA 1872 
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys
610 615 620 

CCT ACG CTG CAC GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC 1920 
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

CAA AAT GAG GTC ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA 1968 
Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
645 650 655 

TGC ATG TCG GCT GAC CTG GAG GTC GTC ACT 1998 
Cys Met Ser Ala Asp Leu Glu Val Val
660 665



(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1998 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1995 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144 
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45 

CTT GGT TGC ATC AAG ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192
Leu Gly Cys Ile Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60 

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240 
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80 

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288 
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336 
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125 

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432 
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT 528 
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly 
165 170 175 

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576 
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

TCC ATG GAA ACT ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC 672 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

CCC CCG GCC GTA CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC 720 
Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro
225 230 235 240 

ACT GGC AGC GGC AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA 768 
Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln
245 250 255 

GGG TAC AAG GTG CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG 816 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

TTT GGG GCG TAT ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA 864 
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg
275 280 285 

ACT GGG GTA AGG ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC 912 
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr
290 295 300 

TAT GGC AAG TTT CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC 960 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

ATC ATA ATA TGT GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG 1008 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu
325 330 335 

GGC ATC GGC ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT 1056 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
340 345 350 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC 1104 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

CCA AAC ATC GAG GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC 1152 
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe
370 375 380

TAT GGC AAA GCC ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC 1200 
Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu
385 390 395 400 

ATT TTC TGT CAT TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG 1248
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu
405 410 415 

TCA GGC CTC GGA ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG 1296 
Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val
420 425 430 

TCC GTC ATA CCA ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT 1344 
Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala
435 440 445 

CTG ATG ACG GGC TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC 1392 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn
450 455 460 

ACA TGT GTC ACC CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC 1440 
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr
465 470 475 480 

ATT GAG ACG ACG ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG 1488 
Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg
485 490 495 

CGG GGT AGG ACT GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT 1536 
Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr
500 505 510 

CCG GGA GAA CGG CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG 1584 
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

TGC TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC 1632 
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

TCG GTT AGG TTG CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC 1680 
Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

CAG GAC CAC CTG GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT 1728 
Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His
565 570 575 

ATA GAT GCA CAC TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC 1776 
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe
580 585 590 

CCC TAC CTG GTA GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC 1824 
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala
595 600 605

CCA CCT CCA TCA TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA 1872
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys
610 615 620

CCT ACG CTG CAC GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC 1920
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val
625 630 635 640

CAA AAT GAG GTC ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA 1968
Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
645 650 655

TGC ATG TCG GCT GAC CTG GAG GTC GTC ACT 1998
Cys Met Ser Ala Asp Leu Glu Val Val
660 665



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1998 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1995 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144 
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45 

CTT GGT TGC AAG AAG ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192 
Leu Gly Cys Lys Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60 

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240 
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80 

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser
85 90 95

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336 
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110 

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125 

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480 
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT 528 
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly 
165 170 175 

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576 
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

TCC ATG GAA ACT ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC 672 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

CCC CCG GCC GTA CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC 720 
Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro
225 230 235 240 

ACT GGC AGC GGC AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA 768 
Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln
245 250 255 

GGG TAC AAG GTG CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG 816 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

TTT GGG GCG TAT ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA 864 
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg
275 280 285 

ACT GGG GTA AGG ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC 912 
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr
290 295 300 

TAT GGC AAG TTT CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC 960 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320

ATC ATA ATA TGT GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG 1008
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu
325 330 335 

GGC ATC GGC ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT 1056 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
340 345 350 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC 1104 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

CCA AAC ATC GAG GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC 1152 
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe
370 375 380 

TAT GGC AAA GCC ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC 1200 
Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu
385 390 395 400 

ATT TTC TGT CAT TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG 1248 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu
405 410 415 

TCA GGC CTC GGA ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG 1296 
Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val
420 425 430 

TCC GTC ATA CCA ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT 1344 
Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala
435 440 445 

CTG ATG ACG GGC TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC 1392 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn
450 455 460 

ACA TGT GTC ACC CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC 1440 
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr
465 470 475 480 

ATT GAG ACG ACG ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG 1488 
Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg
485 490 495 

CGG GGT AGG ACT GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT 1536 
Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr
500 505 510 

CCG GGA GAA CGG CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG 1584 
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

TGC TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC 1632 
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

TCG GTT AGG TTG CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC 1680 
Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys
545 550 555 560

CAG GAC CAC CTG GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT 1728
Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His
565 570 575 

ATA GAT GCA CAC TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC 1776 
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe
580 585 590 

CCC TAC CTG GTA GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC 1824 
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala
595 600 605 

CCA CCT CCA TCA TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA 1872 
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys
610 615 620 

CCT ACG CTG CAC GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC 1920 
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

CAA AAT GAG GTC ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA 1968 
Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
645 650 655 

TGC ATG TCG GCT GAC CTG GAG GTC GTC ACT 1998 
Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1998 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME /KEY : CDS 

(B) LOCATION: 1..1995 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96 
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45 

CTT GGT TGC ATC ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192
Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60 

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80 

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288 
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336 
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110 

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384 
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125 

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432 
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480 
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT GCT GGT GGT 528 
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly 
165 170 175 

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

TCC ATG GAA ACT ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC 672 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

CCC CCG GCC GTA CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC 720 
Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro
225 230 235 240 

ACT GGC AGC GGC AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA 768 
Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln
245 250 255 

GGG TAC AAG GTG CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG 816 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270

TTT GGG GCG TAT ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA 864
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg
275 280 285 

ACT GGG GTA AGG ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC 912 
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr
290 295 300 

TAT GGC AAG TTT CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC 960 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

ATC ATA ATA TGT GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG 1008 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu
325 330 335 

GGC ATC GGC ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT 1056 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
340 345 350 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC 1104 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

CCA AAC ATC GAG GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC 1152 
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe
370 375 380 

TAT GGC AAA GCC ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC 1200 
Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu
385 390 395 400 

ATT TTC TGT CAT TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG 1248 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu
405 410 415 

TCA GGC CTC GGA ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG 1296 
Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val
420 425 430 

TCC GTC ATA CCA ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT 1344 
Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala
435 440 445 

CTG ATG ACG GGC TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC 1392 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn
450 455 460 

ACA TGT GTC ACC CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC 1440 
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr
465 470 475 480 

ATT GAG ACG ACG ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG 1488 
Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg
485 490 495 

CGG GGT AGG ACT GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT 1536
Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr
500 505 510 

CCG GGA GAA CGG CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG 1584 
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

TGC TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC 1632 
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

TCG GTT AGG TTG CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC 1680
Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

CAG GAC CAC CTG GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT 1728 
Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His
565 570 575 

ATA GAT GCA CAC TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC 1776 
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe
580 585 590 

CCC TAC CTG GTA GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC 1824 
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala
595 600 605 

CCA CCT CCA TCA TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA 1872 
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys
610 615 620 

CCT ACG CTG CAC GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC 1920 
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

CAA AAT GAG GTC ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA 1968 
Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
645 650 655 

TGC ATG TCG GCT GAC CTG GAG GTC GTC ACT 1998 
Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1998 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME / KEY : CDS 

(B) LOCATION: 1..1997 



(ix) FEATURE:

(A) NAME /KEY : CDS 

(B) LOCATION: 1..1995 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15 

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96 
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144 
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45 

CTT GGT TGC AAG ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192 
Leu Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60 

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240 
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80 

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288 
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336 
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110 

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384 
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125 

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432 
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480 
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT GCT GGT GGT 528 
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly 
165 170 175 

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205

TCC ATG GAA ACT ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC 672 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

CCC CCG GCC GTA CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC 720 
Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro
225 230 235 240 

ACT GGC AGC GGC AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA 768 
Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln
245 250 255 

GGG TAC AAG GTG CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG 816 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

TTT GGG GCG TAT ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA 864 
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg
275 280 285 

ACT GGG GTA AGG ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC 912
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr
290 295 300 

TAT GGC AAG TTT CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC 960 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

ATC ATA ATA TGT GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG 1008 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu
325 330 335 

GGC ATC GGC ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT 1056 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
340 345 350 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC 1104 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

CCA AAC ATC GAG GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC 1152 
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe
370 375 380 

TAT GGC AAA GCC ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC 1200 
Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu
385 390 395 400 

ATT TTC TGT CAT TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG 1248 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu
405 410 415 

TCA GGC CTC GGA ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG 1296 
Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val
420 425 430

TCC GTC ATA CCA ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT 1344
Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala
435 440 445

CTG ATG ACG GGC TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC 1392
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn
450 455 460

ACA TGT GTC ACC CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC 1440
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr
465 470 475 480

ATT GAG ACG ACG ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG 1488
Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg
485 490 495

CGG GGT AGG ACT GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT 1536
Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr
500 505 510

CCG GGA GAA CGG CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG 1584
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu
515 520 525

TGC TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC 1632
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr
530 535 540

TCG GTT AGG TTG CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC 1680
Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys
545 550 555 560

CAG GAC CAC CTG GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT 1728
Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His
565 570 575

ATA GAT GCA CAC TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC 1776
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe
580 585 590

CCC TAC CTG GTA GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC 1824
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala
595 600 605

CCA CCT CCA TCA TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA 1872
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys
610 615 620

CCT ACG CTG CAC GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC 1920
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val
625 630 635 640

CAA AAT GAG GTC ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA 1968
Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
645 650 655

TGC ATG TCG GCT GAC CTG GAG GTC GTC ACT 1998
Cys Met Ser Ala Asp Leu Glu Val Val



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1998 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME /KEY : CDS 

(B) LOCATION: 1..1995



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96 
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144 
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45 

CTT GGT TGC ATC AAG ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192 
Leu Gly Cys Ile Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60 

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240 
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80 

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288 
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336 
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110 

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384 
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125 

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432 
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT GCT GGT GGT 528 
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly 
165 170 175 

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576 
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu
195 200 205 

TCC ATG GAA ACT ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC 672 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

CCC CCG GCC GTA CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC 720 
Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro
225 230 235 240 

ACT GGC AGC GGC AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA 768 
Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln
245 250 255 

GGG TAC AAG GTG CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG 816 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

TTT GGG GCG TAT ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA 864 
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg
275 280 285 

ACT GGG GTA AGG ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC 912 
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr
290 295 300 

TAT GGC AAG TTT CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC 960 
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

ATC ATA ATA TGT GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG 1008 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu
325 330 335 

GGC ATC GGC ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT 1056 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
340 345 350 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC 1104 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

CCA AAC ATC GAG GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC 1152
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe
370 375 380 

TAT GGC AAA GCC ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC 1200 
Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu
385 390 395 400 

ATT TTC TGT CAT TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG 1248 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu
405 410 415 

TCA GGC CTC GGA ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG 1296 
Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val
420 425 430 

TCC GTC ATA CCA ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT 1344 
Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala
435 440 445 

CTG ATG ACG GGC TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC 1392 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn
450 455 460 

ACA TGT GTC ACC CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC 1440 
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr
465 470 475 480 

ATT GAG ACG ACG ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG 1488 
Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg
485 490 495 

CGG GGT AGG ACT GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT 1536 
Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr
500 505 510 

CCG GGA GAA CGG CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG 1584 
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

TGC TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC 1632 
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr 
530 535 540 

TCG GTT AGG TTG CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC 1680 
Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

CAG GAC CAC CTG GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT 1728 
Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His
565 570 575 

ATA GAT GCA CAC TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC 1776 
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe
580 585 590 

CCC TAC CTG GTA GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC 1824 
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala
595 600 605

CCA CCT CCA TCA TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA 1872
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys
610 615 620 

CCT ACG CTG CAC GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC 1920
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

CAA AAT GAG GTC ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA 1968 
Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
645 650 655 

TGC ATG TCG GCT GAC CTG GAG GTC GTC ACT 1998 
Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1998 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME /KEY : CDS 

(B) LOCATION: 1..1997

(ix) FEATURE: 

(A) NAME /KEY : CDS 

(B) LOCATION: 1..1995 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96 
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA 144 
Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu
35 40 45 

CTT GGT TGC AAG AAG ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC 192 
Leu Gly Cys Lys Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
50 55 60 

GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG 240 
Glu Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala
65 70 75 80

ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA 288 
Thr Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser 
85 90 95 

AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT 336 
Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn
100 105 110 

GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC 384
Val Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser
115 120 125 

TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA 432
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg 
130 135 140 

CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC 480 
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
145 150 155 160 

CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT GCT GGT GGT 528 
Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly 
165 170 175 

CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC 576 
Pro Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala
180 185 190 

GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG 624 
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu 
195 200 205 

TCC ATG GAA ACT ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC 672 
Ser Met Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser 
210 215 220 

CCC CCG GCC GTA CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC 720 
Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro
225 230 235 240 

ACT GGC AGC GGC AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA 768 
Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln
245 250 255 

GGG TAC AAG GTG CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG 816 
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly 
260 265 270 

TTT GGG GCG TAT ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA 864 
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg
275 280 285 

ACT GGG GTA AGG ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC 912
Thr Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr
290 295 300

TAT GGC AAG TTT CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC 960
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp 
305 310 315 320 

ATC ATA ATA TGT GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG 1008 
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu
325 330 335 

GGC ATC GGC ACA GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT 1056 
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
340 345 350 

GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC 1104 
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His 
355 360 365 

CCA AAC ATC GAG GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC 1152 
Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe
370 375 380 

TAT GGC AAA GCC ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC 1200 
Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu
385 390 395 400 

ATT TTC TGT CAT TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG 1248 
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu
405 410 415 

TCA GGC CTC GGA ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG 1296 
Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val
420 425 430 

TCC GTC ATA CCA ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT 1344 
Ser Val Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala
435 440 445 

CTG ATG ACG GGC TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC 1392 
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn
450 455 460 

ACA TGT GTC ACC CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC 1440 
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr
465 470 475 480 

ATT GAG ACG ACG ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG 1488 
Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg
485 490 495 

CGG GGT AGG ACT GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT 1536 
Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr
500 505 510 

CCG GGA GAA CGG CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG 1584 
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
515 520 525 

TGC TAT GAC GCG GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC 1632 
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr
530 535 540

TCG GTT AGG TTG CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC 1680
Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys 
545 550 555 560 

CAG GAC CAC CTG GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT 1728 
Gln Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His
565 570 575 

ATA GAT GCA CAC TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC 1776 
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe
580 585 590 

CCC TAC CTG GTA GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC 1824 
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala
595 600 605 

CCA CCT CCA TCA TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA 1872 
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys
610 615 620 

CCT ACG CTG CAC GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC 1920 
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val 
625 630 635 640 

CAA AAT GAG GTC ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA 1968 
Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala
645 650 655 

TGC ATG TCG GCT GAC CTG GAG GTC GTC ACT 1998 
Cys Met Ser Ala Asp Leu Glu Val Val 
660 665 



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2016 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME /KEY : CDS 

(B) LOCATION: 1..2013



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GCT TAC TCT CTG ACT ACG GGT TCT GTT GTT ATT 96
Arg Gly Ser His Met Ala Tyr Ser Leu Thr Thr Gly Ser Val Val Ile
20 25 30 

GTT GGT AGA ATT ATT TTA TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC 144 
Val Gly Arg Ile Ile Leu Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser
35 40 45

CAA CAG ACG CGG GGC CTA CTT GGT TGC ATC ATC ACT AGC CTT ACA GGC 192 
Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly
50 55 60 

CGG GAC AAG AAC CAG GTC GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA 240 
Arg Asp Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala
65 70 75 80

ACA CAA TCC TTC CTG GCG ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT 288 
Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val
85 90 95 

TAC CAT GGT GCT GGC TCA AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC 336
Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile
100 105 110 

ACC CAG ATG TAC ACT AAT GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG 384 
Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Gln Ala
115 120 125 

CCC CCC GGG GCG CGT TCC TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC 432 
Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp 
130 135 140 

CTT TAC TTG GTC ACG AGA CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG 480 
Leu Tyr Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg
145 150 155 160 

GGC GAC AGT AGG GGG AGC CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG 528 
Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu 
165 170 175 

AAG GGC TCT TCG GGT GGT CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG 576 
Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val 
180 185 190 

GGC ATC TTC CGG GCT GCC GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG 624 
Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val
195 200 205 

GAC TTT GTG CCC GTA GAG TCC ATG GAA ACT ACT ATG CGG TCT CCG GTC 672 
Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val 
210 215 220 

TTC ACG GAC AAC TCA TCC CCC CCG GCC GTA CCG CAG TCA TTT CAA GTG 720 
Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val
225 230 235 240 

GCC CAC CTA CAC GCT CCC ACT GGC AGC GGC AAG AGT ACT AAA GTG CCG 768
Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro
245 250 255 

GCT GCA TAT GCA GCC CAA GGG TAC AAG GTG CTC GTC CTC AAT CCG TCC 816 
Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser
260 265 270 

GTT GCC GCT ACC TTA GGG TTT GGG GCG TAT ATG TCT AAG GCA CAC GGT 864 
Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly 
275 280 285 

ATT GAC CCC AAC ATC AGA ACT GGG GTA AGG ACC ATT ACC ACA GGC GCC 912
Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala
290 295 300 

CCC GTC ACA TAC TCT ACC TAT GGC AAG TTT CTT GCC GAT GGT GGT TGC 960
Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys 
305 310 315 320 

TCT GGG GGC GCT TAT GAC ATC ATA ATA TGT GAT GAG TGC CAT TCA ACT 1008 
Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr
325 330 335 

GAC TCG ACT ACA ATC TTG GGC ATC GGC ACA GTC CTG GAC CAA GCG GAG 1056
Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu
340 345 350 

ACG GCT GGA GCG CGG CTT GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA 1104 
Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly 
355 360 365 

TCG GTC ACC GTG CCA CAC CCA AAC ATC GAG GAG GTG GCC CTG TCT AAT 1152 
Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Asn
370 375 380 

ACT GGA GAG ATC CCC TTC TAT GGC AAA GCC ATC CCC ATT GAA GCC ATC 1200 
Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile
385 390 395 400 

AGG GGG GGA AGG CAT CTC ATT TTC TGT CAT TCC AAG AAG AAG TGC GAC 1248 
Arg Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp
405 410 415 

GAG CTC GCC GCA AAG CTG TCA GGC CTC GGA ATC AAC GCT GTG GCG TAT 1296 
Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr
420 425 430 

TAC CGG GGG CTC GAT GTG TCC GTC ATA CCA ACT ATC GGA GAC GTC GTT 1344 
Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ile Gly Asp Val Val
435 440 445 

GTC GTG GCA ACA GAC GCT CTG ATG ACG GGC TAT ACG GGC GAC TTT GAC 1392 
Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp 
450 455 460 

TCA GTG ATC GAC TGT AAC ACA TGT GTC ACC CAG ACA GTC GAC TTC AGC 1440 
Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser
465 470 475 480 






TTG GAT CCC ACC TTC ACC ATT GAG ACG ACG ACC GTG CCT CAA GAC GCA 1488 
Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala
485 490 495 

GTG TCG CGC TCG CAG CGG CGG GGT AGG ACT GGC AGG GGT AGG AGA GGC 1536 
Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly
500 505 510 

ATC TAC AGG TTT GTG ACT CCG GGA GAA CGG CCC TCG GGC ATG TTC GAT 1584 
Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp
515 520 525 

TCC TCG GTC CTG TGT GAG TGC TAT GAC GCG GGC TGT GCT TGG TAC GAG 1632 
Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu 
530 535 540 

CTC ACC CCC GCC GAG ACC TCG GTT AGG TTG CGG GCC TAC CTG AAC ACA 1680 
Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr 
545 550 555 560 

CCA GGG TTG CCC GTT TGC CAG GAC CAC CTG GAG TTC TGG GAG AGT GTC 1728 
Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Ser Val
565 570 575 

TTC ACA GGC CTC ACC CAT ATA GAT GCA CAC TTC TTG TCC CAG ACC AAG 1776 
Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys
580 585 590 

CAG GCA GGA GAC AAC TTC CCC TAC CTG GTA GCA TAC CAA GCC ACG GTG 1824 
Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val
595 600 605 

TGC GCC AGG GCT CAG GCC CCA CCT CCA TCA TGG GAT CAA ATG TGG AAG 1872 
Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys
610 615 620 

TGT CTC ATA CGG CTG AAA CCT ACG CTG CAC GGG CCA ACA CCC TTG CTG 1920 
Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu
625 630 635 640 

TAC AGG CTG GGA GCC GTC CAA AAT GAG GTC ACC CTC ACC CAC CCC ATA 1968 
Tyr Arg Leu Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile
645 650 655 

ACC AAA TAC ATC ATG GCA TGC ATG TCG GCT GAC CTG GAG GTC GTC 2013 
Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp Leu Glu Val Val
660 665 670 

ACT 2016 



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2016 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..2013 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GCT TAC TCT CTG ACT ACG GGT TCT GTT GTT ATT 96 
Arg Gly Ser His Met Ala Tyr Ser Leu Thr Thr Gly Ser Val Val Ile
20 25 30 

GTT GGT AGA ATT ATT TTA TCT GGT AGT GGT AGT ATC ACG GCC TAC TCC 144 
Val Gly Arg Ile Ile Leu Ser Gly Ser Gly Ser Ile Thr Ala Tyr Ser
35 40 45 

CAA CAG ACG CGG GGC CTA CTT GGT TGC ATC ATC ACT AGC CTT ACA GGC 192 
Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly
50 55 60 

CGG GAC AAG AAC CAG GTC GAG GGA GAG GTT CAG GTG GTT TCC ACC GCA 240 
Arg Asp Lys Asn Gln Val Glu Gly Glu Val Gln Val Val Ser Thr Ala
65 70 75 80 

ACA CAA TCC TTC CTG GCG ACC TGC GTC AAC GGC GTG TGT TGG ACC GTT 288 
Thr Gln Ser Phe Leu Ala Thr Cys Val Asn Gly Val Cys Trp Thr Val
85 90 95 

TAC CAT GGT GCT GGC TCA AAG ACC TTA GCC GGC CCA AAG GGG CCA ATC 336 
Tyr His Gly Ala Gly Ser Lys Thr Leu Ala Gly Pro Lys Gly Pro Ile
100 105 110 

ACC CAG ATG TAC ACT AAT GTG GAC CAG GAC CTC GTC GGC TGG CAG GCG 384 
Thr Gln Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Gln Ala
115 120 125 

CCC CCC GGG GCG CGT TCC TTG ACA CCA TGC ACC TGT GGC AGC TCA GAC 432 
Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp 
130 135 140 

CTT TAC TTG GTC ACG AGA CAT GCT GAC GTC ATT CCG GTG CGC CGG CGG 480 
Leu Tyr Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg
145 150 155 160 

GGC GAC AGT AGG GGG AGC CTG CTC TCC CCC AGG CCT GTC TCC TAC TTG 528 
Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Val Ser Tyr Leu 
165 170 175 

AAG GGC TCT GCT GGT GGT CCA CTG CTC TGC CCT TCG GGG CAC GCT GTG 576
Lys Gly Ser Ala Gly Gly Pro Leu Leu Cys Pro Ser Gly His Ala Val
180 185 190 

GGC ATC TTC CGG GCT GCC GTA TGC ACC CGG GGG GTT GCG AAG GCG GTG 624 
Gly Ile Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val
195 200 205 

GAC TTT GTG CCC GTA GAG TCC ATG GAA ACT ACT ATG CGG TCT CCG GTC 672 
Asp Phe Val Pro Val Glu Ser Met Glu Thr Thr Met Arg Ser Pro Val 
210 215 220 

TTC ACG GAC AAC TCA TCC CCC CCG GCC GTA CCG CAG TCA TTT CAA GTG 720 
Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val
225 230 235 240 

GCC CAC CTA CAC GCT CCC ACT GGC AGC GGC AAG AGT ACT AAA GTG CCG 768 
Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro 
245 250 255 

GCT GCA TAT GCA GCC CAA GGG TAC AAG GTG CTC GTC CTC AAT CCG TCC 816 
Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser
260 265 270 

GTT GCC GCT ACC TTA GGG TTT GGG GCG TAT ATG TCT AAG GCA CAC GGT 864 
Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly 
275 280 285 

ATT GAC CCC AAC ATC AGA ACT GGG GTA AGG ACC ATT ACC ACA GGC GCC 912
Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ala
290 295 300 

CCC GTC ACA TAC TCT ACC TAT GGC AAG TTT CTT GCC GAT GGT GGT TGC 960 
Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys 
305 310 315 320 

TCT GGG GGC GCT TAT GAC ATC ATA ATA TGT GAT GAG TGC CAT TCA ACT 1008 
Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr
325 330 335 

GAC TCG ACT ACA ATC TTG GGC ATC GGC ACA GTC CTG GAC CAA GCG GAG 1056 
Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu
340 345 350 

ACG GCT GGA GCG CGG CTT GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA 1104 
Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly 
355 360 365 

TCG GTC ACC GTG CCA CAC CCA AAC ATC GAG GAG GTG GCC CTG TCT AAT 1152 
Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Asn
370 375 380 

ACT GGA GAG ATC CCC TTC TAT GGC AAA GCC ATC CCC ATT GAA GCC ATC 1200 
Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Ala Ile
385 390 395 400 

AGG GGG GGA AGG CAT CTC ATT TTC TGT CAT TCC AAG AAG AAG TGC GAC 1248 
Arg Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp
405 410 415

GAG CTC GCC GCA AAG CTG TCA GGC CTC GGA ATC AAC GCT GTG GCG TAT 1296
Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly Ile Asn Ala Val Ala Tyr
420 425 430 

TAC CGG GGG CTC GAT GTG TCC GTC ATA CCA ACT ATC GGA GAC GTC GTT 1344 
Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ile Gly Asp Val Val
435 440 445 

GTC GTG GCA ACA GAC GCT CTG ATG ACG GGC TAT ACG GGC GAC TTT GAC 1392 
Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp 
450 455 460 

TCA GTG ATC GAC TGT AAC ACA TGT GTC ACC CAG ACA GTC GAC TTC AGC 1440 
Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser
465 470 475 480 

TTG GAT CCC ACC TTC ACC ATT GAG ACG ACG ACC GTG CCT CAA GAC GCA 1488 
Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala
485 490 495 

GTG TCG CGC TCG CAG CGG CGG GGT AGG ACT GGC AGG GGT AGG AGA GGC 1536
Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly
500 505 510 

ATC TAC AGG TTT GTG ACT CCG GGA GAA CGG CCC TCG GGC ATG TTC GAT 1584 
Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg Pro Ser Gly Met Phe Asp
515 520 525 

TCC TCG GTC CTG TGT GAG TGC TAT GAC GCG GGC TGT GCT TGG TAC GAG 1632 
Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu 
530 535 540 

CTC ACC CCC GCC GAG ACC TCG GTT AGG TTG CGG GCC TAC CTG AAC ACA 1680 
Leu Thr Pro Ala Glu Thr Ser Val Arg Leu Arg Ala Tyr Leu Asn Thr 
545 550 555 560 

CCA GGG TTG CCC GTT TGC CAG GAC CAC CTG GAG TTC TGG GAG AGT GTC 1728 
Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Ser Val
565 570 575 

TTC ACA GGC CTC ACC CAT ATA GAT GCA CAC TTC TTG TCC CAG ACC AAG 1776 
Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys
580 585 590 

CAG GCA GGA GAC AAC TTC CCC TAC CTG GTA GCA TAC CAA GCC ACG GTG 1824 
Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val
595 600 605 

TGC GCC AGG GCT CAG GCC CCA CCT CCA TCA TGG GAT CAA ATG TGG AAG 1872 
Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys
610 615 620 

TGT CTC ATA CGG CTG AAA CCT ACG CTG CAC GGG CCA ACA CCC TTG CTG 1920 
Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu
625 630 635 640 

TAC AGG CTG GGA GCC GTC CAA AAT GAG GTC ACC CTC ACC CAC CCC ATA 1968
Tyr Arg Leu Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile
645 650 655 



ACC AAA TAC ATC ATG GCA TGC ATG TCG GCT GAC CTG GAG GTC GTC 2013
Thr Lys Tyr Ile Met Ala Cys Met Ser Ala Asp Leu Glu Val Val
660 665 670

ACT 2016



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 648 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..648 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96 
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT CCT GCT GGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA CTT 144 
Ser Pro Ala Gly Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu
35 40 45 

GGT TGC ATC ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC GAG 192
Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
50 55 60 

GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG ACC 240 
Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr
65 70 75 80 

TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA AAG 288 
Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys 
85 90 95 

ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT GTG 336 
Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val
100 105 110 

GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC TTG 384
Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu
115 120 125 

ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA CAT 432 
Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His 
130 135 140 

GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC CTG 480 
Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
145 150 155 160 

CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT CCA 528 
Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro 
165 170 175 

CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC GTA 576 
Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val
180 185 190 

TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG TCC 624 
Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser 
195 200 205 

ATG GAA ACT ACT ATG CGG TCT TGA 648 
Met Glu Thr Thr Met Arg Ser * 
210 215 



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 648 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..640 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96 
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT CCT GCT GGT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA CTT 144 
Ser Pro Ala Gly Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu
35 40 45

GGT TGC AAG ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC GAG 192
Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
50 55 60 

GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG ACC 240 
Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr
65 70 75 80 

TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA AAG 288 
Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys 
85 90 95 

ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT GTG 336 
Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val
100 105 110 

GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC TTG 384 
Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu
115 120 125 

ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA CAT 432 
Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His 
130 135 140 

GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC CTG 480 
Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
145 150 155 160 

CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT CCA 528 
Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro 
165 170 175 

CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC GTA 576 
Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val
180 185 190 

TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG TCC 624 
Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser 
195 200 205 

ATG GAA ACT ACT ATG C GGTCTTGA 648 
Met Glu Thr Thr Met 
210 



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 498 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..498



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

ATG ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC GAG GGA GAG 48 
Met Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu
1 5 10 15

GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG ACC TGC GTC 96 
Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val
20 25 30 

AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA AAG ACC TTA 144 
Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu 
35 40 45 

GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT GTG GAC CAG 192 
Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln
50 55 60 

GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC TTG ACA CCA 240 
Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro
65 70 75 80 

TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA CAT GCT GAC 288 
Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp 
85 90 95 

GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC CTG CTC TCC 336 
Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser
100 105 110 

CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT CCA CTG CTC 384 
Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu 
115 120 125 

TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC GTA TGC ACC 432 
Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr
130 135 140 

CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG TCC ATG GAA 480 
Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu 
145 150 155 160 

ACT ACT ATG CGG TCT TGA 498 
Thr Thr Met Arg Ser * 
165 



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 648 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..648



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

ATG GGC AGC AGC CAT CAT CAT CAT CAT CAC AGC AGC GGC CTG GTG CCG 48 
Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15 

CGC GGC AGC CAT ATG GGT TCT GTT GTT ATT GTT GGT AGA ATT ATT TTA 96 
Arg Gly Ser His Met Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu
20 25 30 

TCT GGT GGT TCT ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA CTT 144 
Ser Gly Gly Ser Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu
35 40 45 

GGT TGC AAG ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC GAG 192 
Gly Cys Lys Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
50 55 60 

GGA GAG GTT CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG ACC 240 
Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr
65 70 75 80 

TGC GTC AAC GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA AAG 288 
Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys 
85 90 95 

ACC TTA GCC GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT GTG 336 
Thr Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val
100 105 110 

GAC CAG GAC CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC TTG 384 
Asp Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu
115 120 125 

ACA CCA TGC ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA CAT 432 
Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His 
130 135 140 

GCT GAC GTC ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC CTG 480 
Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
145 150 155 160 

CTC TCC CCC AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT CCA 528 
Leu Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro 
165 170 175 

CTG CTC TGC CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC GTA 576 
Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val
180 185 190

TGC ACC CGG GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG TCC 624 
Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser 
195 200 205 

ATG GAA ACT ACT ATG CGG TCT TGA 648 
Met Glu Thr Thr Met Arg Ser * 
210 215 



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2007 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2004 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

ATG CAT ATG CAT CAT CAT CAC CAT CAT CTG GTG CCG CGC GGC AGC GCG 48 
Met His Met His His His His His His Leu Val Pro Arg Gly Ser Ala 
1 5 10 15

CCC ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA CTT GGT TGC AAG 96 
Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Lys
20 25 30 

ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC GAG GGA GAG GTT 144 
Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
35 40 45 

CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG ACC TGC GTC AAC 192 
Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn
50 55 60 

GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA AAG ACC TTA GCC 240 
Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala 
65 70 75 80 

GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT GTG GAC CAG GAC 288 
Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp
85 90 95 

CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC TTG ACA CCA TGC 336 
Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys
100 105 110

ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA CAT GCT GAC GTC 384
Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val
115 120 125 

ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC CTG CTC TCC CCC 432 
Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro
130 135 140 

AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT CCA CTG CTC TGC 480 
Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys 
145 150 155 160 

CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC GTA TGC ACC CGG 528 
Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg
165 170 175 

GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG TCC ATG GAA ACT 576 
Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr 
180 185 190 

ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC CCC CCG GCC GTA 624 
Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val 
195 200 205 

CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC ACT GGC AGC GGC 672 
Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly
210 215 220 

AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA GGG TAC AAG GTG 720 
Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
225 230 235 240 

CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG TTT GGG GCG TAT 768 
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr 
245 250 255 

ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA ACT GGG GTA AGG 816 
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
260 265 270 

ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC TAT GGC AAG TTT 864
Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe
275 280 285 

CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC ATC ATA ATA TGT 912 
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
290 295 300 

GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG GGC ATC GGC ACA 960 
Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr
305 310 315 320 

GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT GTC GTG CTC GCC 1008 
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
325 330 335 

ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC CCA AAC ATC GAG 1056
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
340 345 350 

GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC TAT GGC AAA GCC 1104 
Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
355 360 365 

ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC ATT TTC TGT CAT 1152 
Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His
370 375 380 

TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG TCA GGC CTC GGA 1200 
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly 
385 390 395 400 

ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG TCC GTC ATA CCA 1248 
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
405 410 415 

ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT CTG ATG ACG GGC 1296 
Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
420 425 430 

TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC ACA TGT GTC ACC 1344 
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
435 440 445 

CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC ATT GAG ACG ACG 1392 
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
450 455 460 

ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG CGG GGT AGG ACT 1440 
Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
465 470 475 480 

GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT CCG GGA GAA CGG 1488 
Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg
485 490 495 

CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG TGC TAT GAC GCG 1536 
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
500 505 510 

GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC TCG GTT AGG TTG 1584 
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu 
515 520 525 

CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC CAG GAC CAC CTG 1632 
Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
530 535 540 

GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT ATA GAT GCA CAC 1680 
Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His
545 550 555 560 

TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC CCC TAC CTG GTA 1728 
Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val
565 570 575

GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC CCA CCT CCA TCA 1776
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
580 585 590 

TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA CCT ACG CTG CAC 1824 
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
595 600 605 

GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC CAA AAT GAG GTC 1872
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val
610 615 620 

ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA TGC ATG TCG GCC 1920 
Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala
625 630 635 640 

GAC CTG GAG GTC GTT ACG TAG GAA TTC GAG CTC CGT CGA CAA GCT TGC 1968 
Asp Leu Glu Val Val Thr * Glu Phe Glu Leu Arg Arg Gln Ala Cys
645 650 655 

GGC CGC ACT CGA GCA CCA CCA CCA CCA CCA CTG AGA TCC 2007 
Gly Arg Thr Arg Ala Pro Pro Pro Pro Pro Leu Arg 
660 665 



(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2007 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..2004 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

ATG CAT ATG CAT CAT CAT CAC CAT CAT CTG GTG CCG CGC GGC AGC GCG 48 
Met His Met His His His His His His Leu Val Pro Arg Gly Ser Ala 
1 5 10 15

CCC ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA CTT GGT TGC ATC 96 
Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile
20 25 30 

AAG ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC GAG GGA GAG GTT 144 
Lys Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val
35 40 45

CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG ACC TGC GTC AAC 192
Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn
50 55 60 

GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA AAG ACC TTA GCC 240 
Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala 
65 70 75 80 

GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT GTG GAC CAG GAC 288 
Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp
85 90 95 

CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC TTG ACA CCA TGC 336 
Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys
100 105 110 

ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA CAT GCT GAC GTC 384 
Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val 
115 120 125 

ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC CTG CTC TCC CCC 432 
Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro
130 135 140 

AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT CCA CTG CTC TGC 480 
Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys 
145 150 155 160 

CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC GTA TGC ACC CGG 528 
Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg
165 170 175 

GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG TCC ATG GAA ACT 576 
Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr 
180 185 190 

ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC CCC CCG GCC GTA 624 
Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val 
195 200 205 

CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC ACT GGC AGC GGC 672 
Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly
210 215 220 

AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA GGG TAC AAG GTG 720 
Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
225 230 235 240 

CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG TTT GGG GCG TAT 768 
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr 
245 250 255 

ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA ACT GGG GTA AGG 816 
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
260 265 270 

ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC TAT GGC AAG TTT 864 
Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe
275 280 285

CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC ATC ATA ATA TGT 912 
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
290 295 300 

GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG GGC ATC GGC ACA 960
Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr
305 310 315 320 

GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT GTC GTG CTC GCC 1008 
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
325 330 335 

ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC CCA AAC ATC GAG 1056 
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
340 345 350 

GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC TAT GGC AAA GCC 1104 
Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
355 360 365 

ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC ATT TTC TGT CAT 1152 
Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His
370 375 380 

TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG TCA GGC CTC GGA 1200 
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly 
385 390 395 400 

ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG TCC GTC ATA CCA 1248
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
405 410 415 

ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT CTG ATG ACG GGC 1296 
Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
420 425 430 

TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC ACA TGT GTC ACC 1344 
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
435 440 445 

CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC ATT GAG ACG ACG 1392 
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
450 455 460 

ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG CGG GGT AGG ACT 1440 
Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
465 470 475 480 

GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT CCG GGA GAA CGG 1488 
Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg
485 490 495 

CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG TGC TAT GAC GCG 1536 
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
500 505 510 

GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC TCG GTT AGG TTG 1584 
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu 
515 520 525 

CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC CAG GAC CAC CTG 1632 
Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu 
530 535 540 

GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT ATA GAT GCA CAC 1680 
Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His 
545 550 555 560 

TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC CCC TAC CTG GTA 1728 
Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val 
565 570 575 

GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC CCA CCT CCA TCA 1776 
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser 
580 585 590 

TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA CCT ACG CTG CAC 1824 
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His 
595 600 605 

GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC CAA AAT GAG GTC 1872 
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val 
610 615 620 

ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA TGC ATG TCG GCC 1920 
Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala 
625 630 635 640 

GAC CTG GAG GTC GTT ACG TAG GAA TTC GAG CTC CGT CGA CAA GCT TGC 1968 
Asp Leu Glu Val Val Thr * Glu Phe Glu Leu Arg Arg Gln Ala Cys 
645 650 655 

GGC CGC ACT CGA GCA CCA CCA CCA CCA CCA CTG AGA TCC 2007 
Gly Arg Thr Arg Ala Pro Pro Pro Pro Pro Leu Arg 
660 665 



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2007 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2004 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

ATG CAT ATG CAT CAT CAT CAC CAT CAT CTG GTG CCG CGC GGC AGC GCG 48 
Met His Met His His His His His His Leu Val Pro Arg Gly Ser Ala 
1 5 10 15 

CCC ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA CTT GGT TGC ATC 96 
Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile 
20 25 30 

ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC GAG GGA GAG GTT 144 
Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val 
35 40 45 

CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG ACC TGC GTC AAC 192 
Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn 
50 55 60 

GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA AAG ACC TTA GCC 240 
Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala 
65 70 75 80 

GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT GTG GAC CAG GAC 288 
Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp 
85 90 95 

CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC TTG ACA CCA TGC 336 
Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys 
100 105 110 

ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA CAT GCT GAC GTC 384 
Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val 
115 120 125 

ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC CTG CTC TCC CCC 432 
Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro 
130 135 140 

AGG CCT GTC TCC TAC TTG AAG GGC TCT GCT GGT GGT CCA CTG CTC TGC 480 
Arg Pro Val Ser Tyr Leu Lys Gly Ser Ala Gly Gly Pro Leu Leu Cys 
145 150 155 160 

CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC GTA TGC ACC CGG 528 
Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg 
165 170 175 

GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG TCC ATG GAA ACT 576 
Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr 
180 185 190 

ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC CCC CCG GCC GTA 624 
Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val 
195 200 205 

CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC ACT GGC AGC GGC 672 
Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly 
210 215 220 

AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA GGG TAC AAG GTG 720 
Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val 
225 230 235 240 

CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG TTT GGG GCG TAT 768 
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr 
245 250 255 

ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA ACT GGG GTA AGG 816 
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg 
260 265 270 

ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC TAT GGC AAG TTT 864 
Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe 
275 280 285 

CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC ATC ATA ATA TGT 912 
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys 
290 295 300 

GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG GGC ATC GGC ACA 960 
Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr 
305 310 315 320 

GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT GTC GTG CTC GCC 1008 
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala 
325 330 335 

ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC CCA AAC ATC GAG 1056 
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu 
340 345 350 

GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC TAT GGC AAA GCC 1104 
Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala 
355 360 365 

ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC ATT TTC TGT CAT 1152 
Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His 
370 375 380 

TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG TCA GGC CTC GGA 1200 
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly 
385 390 395 400 

ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG TCC GTC ATA CCA 1248 
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro 
405 410 415 

ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT CTG ATG ACG GGC 1296 
Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly 
420 425 430 

TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC ACA TGT GTC ACC 1344 
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr 
435 440 445 

CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC ATT GAG ACG ACG 1392 
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr 
450 455 460 

ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG CGG GGT AGG ACT 1440 
Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr 
465 470 475 480 

GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT CCG GGA GAA CGG 1488 
Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg 
485 490 495 

CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG TGC TAT GAC GCG 1536 
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
500 505 510 

GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC TCG GTT AGG TTG 1584 
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu 
515 520 525 

CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC CAG GAC CAC CTG 1632 
Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu 
530 535 540 

GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT ATA GAT GCA CAC 1680 
Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His 
545 550 555 560 

TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC CCC TAC CTG GTA 1728 
Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val 
565 570 575 

GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC CCA CCT CCA TCA 1776 
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser 
580 585 590 

TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA CCT ACG CTG CAC 1824 
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His 
595 600 605 

GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC CAA AAT GAG GTC 1872 
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val 
610 615 620 

ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA TGC ATG TCG GCC 1920 
Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala 
625 630 635 640 

GAC CTG GAG GTC GTT ACG TAG GAA TTC GAG CTC CGT CGA CAA GCT TGC 1968 
Asp Leu Glu Val Val Thr * Glu Phe Glu Leu Arg Arg Gln Ala Cys 
645 650 655 

GGC CGC ACT CGA GCA CCA CCA CCA CCA CCA CTG AGA TCC 2007 
Gly Arg Thr Arg Ala Pro Pro Pro Pro Pro Leu Arg 
660 665 



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2007 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2004 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 



ATG CAT ATG CAT CAT CAT CAC CAT CAT CTG GTG CCG CGC GGC AGC GCG 48 
Met His Met His His His His His His Leu Val Pro Arg Gly Ser Ala 
1 5 10 15 



CCC ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA CTT GGT TGC ATC 96 
Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile 
20 25 30 

ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC GAG GGA GAG GTT 144 
Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val 
35 40 45 



CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG ACC TGC GTC AAC 192 
Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn 
50 55 60 



GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA AAG ACC TTA GCC 240 
Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala 
65 70 75 80 

GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT GTG GAC CAG GAC 288 
Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp 
85 90 95 

CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC TTG ACA CCA TGC 336 
Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys 
100 105 110 

ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA CAT GCT GAC GTC 384 
Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val 
115 120 125 



ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC CTG CTC TCC CCC 432 
Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro 
130 135 140 



AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT CCA CTG CTC TGC 480 
Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys 
145 150 155 160 



CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC GTA TGC ACC CGG 528 
Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg 
165 170 175 

GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG TCC ATG GAA ACT 576 
Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr 
180 185 190 

ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC CCC CCG GCC GTA 624 
Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val 
195 200 205 

CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC ACT GGC AGC GGC 672 
Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly 
210 215 220 

AAG AGT ACT AAA GTG CCG GCT GCA TAT GCA GCC CAA GGG TAC AAG GTG 720 
Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val 
225 230 235 240 

CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG TTT GGG GCG TAT 768 
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr 
245 250 255 

ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA ACT GGG GTA AGG 816 
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg 
260 265 270 

ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC TAT GGC AAG TTT 864 
Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe 
275 280 285 

CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC ATC ATA ATA TGT 912 
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys 
290 295 300 

GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG GGC ATC GGC ACA 960 
Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr 
305 310 315 320 

GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT GTC GTG CTC GCC 1008 
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala 
325 330 335 

ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC CCA AAC ATC GAG 1056 
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu 
340 345 350 

GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC TAT GGC AAA GCC 1104 
Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala 
355 360 365 

ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC ATT TTC TGT CAT 1152 
Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His 
370 375 380 

TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG TCA GGC CTC GGA 1200 
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly 
385 390 395 400 

ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG TCC GTC ATA CCA 1248 
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro 
405 410 415 

ACT TCC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT CTG ATG ACG GGC 1296 
Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly 
420 425 430 

TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC ACA TGT GTC ACC 1344 
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr 
435 440 445 

CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC ATT GAG ACG ACG 1392 
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr 
450 455 460 

ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG CGG GGT AGG ACT 1440 
Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr 
465 470 475 480 

GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT CCG GGA GAA CGG 1488 
Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg 
485 490 495 

CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG TGC TAT GAC GCG 1536 
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
500 505 510 

GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC TCG GTT AGG TTG 1584 
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu 
515 520 525 

CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC CAG GAC CAC CTG 1632 
Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu 
530 535 540 

GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT ATA GAT GCA CAC 1680 
Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His 
545 550 555 560 

TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC CCC TAC CTG GTA 1728 
Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val 
565 570 575 

GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC CCA CCT CCA TCA 1776 
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser 
580 585 590 

TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA CCT ACG CTG CAC 1824 
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His 
595 600 605 

GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC CAA AAT GAG GTC 1872 
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val 
610 615 620 

ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA TGC ATG TCG GCC 1920 
Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala 
625 630 635 640 

GAC CTG GAG GTC GTT ACG TAG GAA TTC GAG CTC CGT CGA CAA GCT TGC 1968 
Asp Leu Glu Val Val Thr * Glu Phe Glu Leu Arg Arg Gln Ala Cys 
645 650 655 

GGC CGC ACT CGA GCA CCA CCA CCA CCA CCA CTG AGA TCC 2007 
Gly Arg Thr Arg Ala Pro Pro Pro Pro Pro Leu Arg 
660 665 



(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2007 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2007 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

ATG CAT ATG CAT CAT CAT CAC CAT CAT CTG GTG CCG CGC GGC AGC GCG 48 
Met His Met His His His His His His Leu Val Pro Arg Gly Ser Ala 
1 5 10 15 

CCC ATC ACG GCC TAC TCC CAA CAG ACG CGG GGC CTA CTT GGT TGC ATC 96 
Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile 
20 25 30 

ATC ACT AGC CTT ACA GGC CGG GAC AAG AAC CAG GTC GAG GGA GAG GTT 144 
Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val 
35 40 45 

CAG GTG GTT TCC ACC GCA ACA CAA TCC TTC CTG GCG ACC TGC GTC AAC 192 
Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val Asn 
50 55 60 

GGC GTG TGT TGG ACC GTT TAC CAT GGT GCT GGC TCA AAG ACC TTA GCC 240 
Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu Ala 
65 70 75 80 

GGC CCA AAG GGG CCA ATC ACC CAG ATG TAC ACT AAT GTG GAC CAG GAC 288 
Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln Asp 
85 90 95 

CTC GTC GGC TGG CAG GCG CCC CCC GGG GCG CGT TCC TTG ACA CCA TGC 336 
Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro Cys 
100 105 110 

ACC TGT GGC AGC TCA GAC CTT TAC TTG GTC ACG AGA CAT GCT GAC GTC 384 
Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp Val 
115 120 125 

ATT CCG GTG CGC CGG CGG GGC GAC AGT AGG GGG AGC CTG CTC TCC CCC 432 
Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro 
130 135 140 

AGG CCT GTC TCC TAC TTG AAG GGC TCT TCG GGT GGT CCA CTG CTC TGC 480 
Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys 
145 150 155 160 

CCT TCG GGG CAC GCT GTG GGC ATC TTC CGG GCT GCC GTA TGC ACC CGG 528 
Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr Arg 
165 170 175 

GGG GTT GCG AAG GCG GTG GAC TTT GTG CCC GTA GAG TCC ATG GAA ACT 576 
Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu Thr 
180 185 190 

ACT ATG CGG TCT CCG GTC TTC ACG GAC AAC TCA TCC CCC CCG GCC GTA 624 
Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val 
195 200 205 

CCG CAG TCA TTT CAA GTG GCC CAC CTA CAC GCT CCC ACT GGC AGC GGC 672 
Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly 
210 215 220 

AAG AGT ACT AAA GTG CCG GCT GCC TAC GCA GCC CAA GGG TAC AAG GTG 720 
Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val 
225 230 235 240 

CTC GTC CTC AAT CCG TCC GTT GCC GCT ACC TTA GGG TTT GGG GCG TAT 768 
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr 
245 250 255 

ATG TCT AAG GCA CAC GGT ATT GAC CCC AAC ATC AGA ACT GGG GTA AGG 816 
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg 
260 265 270 

ACC ATT ACC ACA GGC GCC CCC GTC ACA TAC TCT ACC TAT GGC AAG TTT 864 
Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys Phe 
275 280 285 

CTT GCC GAT GGT GGT TGC TCT GGG GGC GCT TAT GAC ATC ATA ATA TGT 912 
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys 
290 295 300 

GAT GAG TGC CAT TCA ACT GAC TCG ACT ACA ATC TTG GGC ATC GGC ACA 960 
Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly Thr 
305 310 315 320 

GTC CTG GAC CAA GCG GAG ACG GCT GGA GCG CGG CTT GTC GTG CTC GCC 1008 
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala 
325 330 335 

ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCA CAC CCA AAC ATC GAG 1056 
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu 
340 345 350 

GAG GTG GCC CTG TCT AAT ACT GGA GAG ATC CCC TTC TAT GGC AAA GCC 1104 
Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala 
355 360 365 

ATC CCC ATT GAA GCC ATC AGG GGG GGA AGG CAT CTC ATT TTC TGT CAT 1152 
Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys His 
370 375 380 

TCC AAG AAG AAG TGC GAC GAG CTC GCC GCA AAG CTG TCA GGC CTC GGA 1200 
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu Gly 
385 390 395 400 

ATC AAC GCT GTG GCG TAT TAC CGG GGG CTC GAT GTG TCC GTC ATA CCA 1248 
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro 
405 410 415 

ACT ATC GGA GAC GTC GTT GTC GTG GCA ACA GAC GCT CTG ATG ACG GGC 1296 
Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly 
420 425 430 

TAT ACG GGC GAC TTT GAC TCA GTG ATC GAC TGT AAC ACA TGT GTC ACC 1344 
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr 
435 440 445 

CAG ACA GTC GAC TTC AGC TTG GAT CCC ACC TTC ACC ATT GAG ACG ACG 1392 
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr 
450 455 460 

ACC GTG CCT CAA GAC GCA GTG TCG CGC TCG CAG CGG CGG GGT AGG ACT 1440 
Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr 
465 470 475 480 

GGC AGG GGT AGG AGA GGC ATC TAC AGG TTT GTG ACT CCG GGA GAA CGG 1488 
Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu Arg 
485 490 495 

CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG TGC TAT GAC GCG 1536 
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
500 505 510 

GGC TGT GCT TGG TAC GAG CTC ACC CCC GCC GAG ACC TCG GTT AGG TTG 1584 
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg Leu 
515 520 525 

CGG GCC TAC CTG AAC ACA CCA GGG TTG CCC GTT TGC CAG GAC CAC CTG 1632 
Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu 
530 535 540 

GAG TTC TGG GAG AGT GTC TTC ACA GGC CTC ACC CAT ATA GAT GCA CAC 1680 
Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala His 
545 550 555 560 

TTC TTG TCC CAG ACC AAG CAG GCA GGA GAC AAC TTC CCC TAC CTG GTA 1728 
Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu Val 
565 570 575 

GCA TAC CAA GCC ACG GTG TGC GCC AGG GCT CAG GCC CCA CCT CCA TCA 1776 
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser 
580 585 590 

TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTG AAA CCT ACG CTG CAC 1824 
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His 
595 600 605 

GGG CCA ACA CCC TTG CTG TAC AGG CTG GGA GCC GTC CAA AAT GAG GTC 1872 
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Val 
610 615 620 

ACC CTC ACC CAC CCC ATA ACC AAA TAC ATC ATG GCA TGC ATG TCG GCC 1920 
Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser Ala 
625 630 635 640 



GAC CTG GAG GTC GTT ACG TAG GAA TTC GAG CTC CGT CGA CAA GCT TGC 1968 
Asp Leu Glu Val Val Thr * Glu Phe Glu Leu Arg Arg Gln Ala Cys 
645 650 655 

GGC CGC ACT CGA GCA CCA CCA CCA CCA CCA CTG AGA TCC 2007 
Gly Arg Thr Arg Ala Pro Pro Pro Pro Pro Leu Arg Ser 
660 665 



(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 
GCUCGCCCGG GGAUCCUCUA GGAAUACACG UUCGAU 36 
(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:122: 
CUAGAGGAUC CCCGGGCGAG CCCUAUAGUG AGUCGU 36 

(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:123: 
GCTCGCCCGG GGATCCTCTA G 21 